Python regular expression to replace strings in csv
I have a csv file formatted as below:
cat, mammal[1]
shark, fish[2]
dog, mammal[3]
tiger, mammal[4]
salmon, fish[5]
In every row containing mammal, I would like to replace mammal, along with the square brackets and the number inside them, with mam.
The output should be as follows:
cat, mam
shark, fish[2]
dog, mam
tiger, mam
salmon, fish[5]
So far I have code to read and write the csv file:
import csv
import re
with open('animals.csv', 'r') as fin, open("out.csv",'w') as fout:
    writer = csv.writer(fout)
    for row in csv.reader(fin):
        re.sub(???) #stuck at writing the regular expression
        writer.writerow(row)
You can use the following regex for your replacement:
for row in csv.reader(fin):
    # r'\1' must be a raw string; a plain '\1' is the control character \x01
    row[1] = re.sub(r'(\s*mam)mal\[\d+\]', r'\1', row[1])
    writer.writerow(row)
See demonstration.
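As a self-contained check of the regex approach, the sketch below applies the same substitution to the sample rows held in memory rather than to animals.csv. The `io.StringIO` buffers and the `shorten_mammals` helper name are assumptions made for illustration and are not part of the original answer.

```python
import csv
import io
import re

# Sample data copied from the question (note the space after each comma).
SAMPLE = (
    "cat, mammal[1]\n"
    "shark, fish[2]\n"
    "dog, mammal[3]\n"
    "tiger, mammal[4]\n"
    "salmon, fish[5]\n"
)

# Capture optional leading whitespace plus "mam"; match and drop "mal[<digits>]".
PATTERN = re.compile(r'(\s*mam)mal\[\d+\]')


def shorten_mammals(text):
    """Hypothetical helper: return CSV text with 'mammal[n]' cut to 'mam'."""
    fin = io.StringIO(text)
    fout = io.StringIO()
    writer = csv.writer(fout, lineterminator="\n")
    for row in csv.reader(fin):
        # Raw string so that \1 is a backreference to the captured group.
        row[1] = PATTERN.sub(r'\1', row[1])
        writer.writerow(row)
    return fout.getvalue()


result = shorten_mammals(SAMPLE)
print(result)
```

Because the leading whitespace is inside the capture group, the space after the comma survives the substitution.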
No need for regex here:
for row in csv.reader(fin):
    if row[1].startswith("mammal["):
        row[1] = "mam"
    writer.writerow(row)
Performance-wise this is the best option, or even faster with a generator comprehension and writerows:
with open('animals.csv', 'r') as fin, open("out.csv",'w') as fout:
    csv.writer(fout).writerows([row[0],"mam"] if row[1].startswith("mammal[") else row for row in csv.reader(fin))
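The performance claim can be checked locally with `timeit`. The sketch below is only a rough comparison under stated assumptions: the field list mirrors the second column of the sample file (leading space included), the function names are hypothetical, and timings will differ between machines, so no figures are quoted here.

```python
import re
import timeit

# Second-column values as csv.reader would return them for the sample file.
FIELDS = [" mammal[1]", " fish[2]", " mammal[3]", " mammal[4]", " fish[5]"]

PATTERN = re.compile(r'(\s*mam)mal\[\d+\]')


def with_regex(fields):
    """Regex variant: strip 'mal[n]' wherever the pattern matches."""
    return [PATTERN.sub(r'\1', f) for f in fields]


def with_startswith(fields):
    """Plain-string variant: replace the whole field if it has the prefix."""
    return [" mam" if f.startswith(" mammal[") else f for f in fields]


# Time both variants on the same input; the repeat count is arbitrary.
regex_time = timeit.timeit(lambda: with_regex(FIELDS), number=10_000)
plain_time = timeit.timeit(lambda: with_startswith(FIELDS), number=10_000)
print(f"regex: {regex_time:.4f}s  startswith: {plain_time:.4f}s")
```

The two variants agree on this sample, but they are not equivalent in general: the regex rewrites a match anywhere in the field, while the prefix test replaces the whole field only when it begins with the search string.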
Note: there seems to be a leading space in the second column. In that case, add a space in front of the search and replace strings.
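With the leading space taken into account, the no-regex version might look like the sketch below. It runs on in-memory sample data so it can be tried without any files; the `io.StringIO` buffers and the `shorten_mammals_no_regex` name are assumptions for illustration.

```python
import csv
import io

# Sample data copied from the question (note the space after each comma).
SAMPLE = (
    "cat, mammal[1]\n"
    "shark, fish[2]\n"
    "dog, mammal[3]\n"
    "tiger, mammal[4]\n"
    "salmon, fish[5]\n"
)


def shorten_mammals_no_regex(text):
    """Hypothetical helper: prefix test with the leading space included."""
    fin = io.StringIO(text)
    fout = io.StringIO()
    # writerows consumes the generator row by row; no intermediate list is built.
    csv.writer(fout, lineterminator="\n").writerows(
        [row[0], " mam"] if row[1].startswith(" mammal[") else row
        for row in csv.reader(fin)
    )
    return fout.getvalue()


output = shorten_mammals_no_regex(SAMPLE)
print(output)
```

When working with real files instead of buffers, the `csv` module documentation recommends opening them with `newline=''` so that line endings are handled by the reader and writer themselves.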