python - Multiple regex replacements based on lists in multiple files


I have a folder with multiple text files inside that I need to process and format using multiple replacement lists that look like this:

old string1~new string1
old string2~new string2
etc~blah

I run each replacement pair from the replacement lists on each line of multiple text files. I have a set of Python scripts that perform this operation. I wonder whether switching to sed or awk would make the code simpler and more maintainable. Is that a better solution, or should I improve my Python code instead? I ask because the incoming text files arrive on a regular basis and usually have a slightly different structure than before: mistakes, misspellings, and multiple spaces, since these files are created by humans. Each time, I have to tweak the code and the replacement lists to make it work properly. Thanks.

Unless your Python code is really bad, switching to awk is not likely to make it more maintainable. That said, it's pretty simple in awk, but it does not scale well:

cat replacement-list-files* | awk 'FILENAME == "-" {
    split( $0, a, "~" ); repl[ a[1] ] = a[2]; next }
  { for( i in repl ) gsub( i, repl[i] ) }1' - input-file

Note that this works on one file at a time. You could replace the trailing 1 with { print > ( FILENAME ".new" ) } to work on multiple files, but then you have to deal with closing files if you want to work on a large number of them, and it becomes an unmaintainable mess. Stick with Python if you have a working solution.
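For comparison, here is a minimal Python sketch of the same idea, not the asker's actual scripts. It assumes the list files contain one old~new pair per line, that the left-hand side is meant as a regular expression, and that output should go to a sibling file with a ".new" suffix. The function names, the "*.txt" glob, and the suffix are all made up for illustration.

```python
import re
from pathlib import Path


def load_replacements(list_paths, sep="~"):
    """Read 'old~new' pairs from each list file, keeping their order."""
    pairs = []
    for path in list_paths:
        for line in Path(path).read_text(encoding="utf-8").splitlines():
            if sep not in line:
                continue  # skip blank or malformed lines
            old, new = line.split(sep, 1)
            # Assumption: 'old' is a regex; use re.escape(old) for literal text.
            pairs.append((re.compile(old), new))
    return pairs


def apply_replacements(line, pairs):
    """Apply every replacement pair, in order, to a single line."""
    for pattern, new in pairs:
        line = pattern.sub(new, line)
    return line


def process_folder(folder, pairs, glob="*.txt", suffix=".new"):
    """Write a processed copy of each matching file next to the original."""
    for src in sorted(Path(folder).glob(glob)):
        lines = src.read_text(encoding="utf-8").splitlines(keepends=True)
        result = "".join(apply_replacements(line, pairs) for line in lines)
        src.with_name(src.name + suffix).write_text(result, encoding="utf-8")
```

Keeping the pairs in an ordered list (rather than an awk-style associative array) means the replacements run in the order they appear in the list files, which matters when one pair's output feeds another. Note that re.sub interprets backslashes and group references in the replacement string, so a "new" value containing a backslash needs care.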

