Background
I have a two-column CSV file like this:
Find | Replace |
---|---|
is | was |
A | one |
b | two |
etc.
The first column is the text to find and the second is its replacement.
I have a second file with some text like this:
"This is A paragraph in a text file." (Please note the case sensitivity)
My requirement:
I want to use that csv file to search and replace in the text file with three conditions:-
Script tried:
import csv
import re

with open('CSV_file.csv', mode='r') as infile:
    reader = csv.reader(infile)
    # <-- Requires attention (next line)
    mydict = {(r'\b' + rows[0] + r'\b'): (r'\b' + rows[1] + r'\b') for rows in reader}
print(mydict)
with open('find.txt') as infile, open(r'resul_out.txt', 'w') as outfile:
    for line in infile:
        for src, target in mydict.items():
            line = re.sub(src, target, line)  # <-- Requires attention
            # line = line.replace(src, target)
        outfile.write(line)
Description of script: I load my CSV into a Python dictionary and use regex to find whole words.
Problems
I used r'\b' to mark word boundaries so that only whole words are replaced, but the printed dictionary shows '\\b' instead of '\b'.
Using str.replace instead gives output like:
"Thwas was one paragraph in a text file."
Secondly, I don't know how to make the replacement case sensitive in the regex pattern.
Does anyone know a better solution than this script, or a way to improve it?
Thanks for any help.
I'd just put pure strings into mydict so it looks like
{'is': 'was', 'A': 'one', ...}
and replace this line:
# line = re.sub(src, target, line) # old
line = re.sub(r'\b' + src + r'\b', target, line) # new
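As a quick check of that change, here is a minimal sketch applied to the sample sentence from the question. The dictionary literal and the function name `replace_whole_words` are illustrative only, and `re.escape` is an addition that is not part of the suggestion above; it assumes the Find entries are literal text rather than regex patterns.

```python
import re

# Plain strings only; in the real script this would be loaded from the CSV.
mydict = {'is': 'was', 'A': 'one', 'b': 'two'}

def replace_whole_words(line, mapping):
    """Apply each whole-word, case-sensitive replacement in turn."""
    for src, target in mapping.items():
        # re.escape is an added safeguard (assumption: keys are literal text).
        line = re.sub(r'\b' + re.escape(src) + r'\b', target, line)
    return line

result = replace_whole_words("This is A paragraph in a text file.", mydict)
print(result)

# The raw string holds two characters: a backslash and 'b'.
# repr() displays the backslash doubled, which is what print(mydict) shows.
print(len(r'\b'), repr(r'\b'))
```

The "is" inside "This" and the lowercase "a" are left alone, because `\b` only matches at word edges and `re.sub` is case sensitive unless `re.IGNORECASE` is passed.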
Note that you don't need \b in the replacement string. Regarding your other questions: regex matching in Python is case sensitive by default, and the change from '\b' to '\\b' is exactly what the r'' prefix does. You can omit the r and write '\\b', but that quickly gets ugly with more complex regexes.

Here's a more cumbersome approach (more code), but one that is easier to read and does not rely on regular expressions. In fact, given the very simple nature of your CSV control file, I wouldn't normally bother using the csv module at all:
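import csv

# Build the find -> replace lookup from the control file.
# delimiter=' ' assumes the two columns are separated by a single space.
with open('temp.csv', newline='') as c:
    reader = csv.DictReader(c, delimiter=' ')
    D = {}
    for row in reader:
        D[row['Find']] = row['Replace']

with open('input.txt', newline='') as infile:
    with open('output.txt', 'w') as outfile:
        for line in infile:
            # Split on whitespace and swap any token that exactly matches a key.
            tokens = line.split()
            for i, t in enumerate(tokens):
                if t in D:
                    tokens[i] = D[t]
            outfile.write(' '.join(tokens) + '\n')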
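One possible refinement, offered as a sketch rather than as either answerer's method: compile all Find entries into a single whole-word pattern and substitute in one pass. This keeps punctuation and spacing intact, and text that has already been replaced is not scanned again by a later entry. The helper names `load_mapping` and `build_replacer` are hypothetical, the control file is assumed to be comma-separated with a `Find,Replace` header row, and the Find entries are assumed to be non-empty and to start and end with word characters.

```python
import csv
import io
import re

def load_mapping(csv_file):
    """Read Find/Replace pairs from a comma-separated file with a header row."""
    reader = csv.DictReader(csv_file)
    return {row['Find']: row['Replace'] for row in reader}

def build_replacer(mapping):
    """Return a function that replaces whole words in a single pass."""
    # Longest keys first, so a longer entry is tried before a shorter one.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b'
    )
    # Each match is looked up in the mapping; matching is case sensitive.
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)

# In-memory stand-in for the CSV control file (illustrative contents).
control = io.StringIO("Find,Replace\nis,was\nA,one\nb,two\n")
replace = build_replacer(load_mapping(control))
print(replace("This is A paragraph in a text file."))
```

To process real files, open the CSV with `open(path, newline='')`, pass the file object to `load_mapping`, and call `replace(line)` on each line of the input file before writing it out.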