
Python regular expression to replace strings in csv

I have a csv file formatted as below:

   cat, mammal[1]
   shark, fish[2]
   dog, mammal[3]
   tiger, mammal[4]
   salmon, fish[5]

In every row containing mammal, I would like to replace mammal, together with the square brackets and the number inside them, with mam.

The output should be as follows:

cat, mam
shark, fish[2]
dog, mam
tiger, mam
salmon, fish[5]

So far I have code to read and write the csv file:

import csv
import re


with open('animals.csv', 'r') as fin, open("out.csv", 'w') as fout:
    writer = csv.writer(fout)
    for row in csv.reader(fin):
        re.sub(???)  # stuck at writing the regular expression
        writer.writerow(row)

You can use the following regex for your replacement:

for row in csv.reader(fin):
    # The replacement must be a raw string: a plain '\1' is the control character \x01
    row[1] = re.sub(r'(\s*mam)mal\[\d+\]', r'\1', row[1])
    writer.writerow(row)
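
As a quick check of that pattern outside the csv loop, here is a minimal sketch that applies it to a few field values. The helper name `shorten` and the sample strings are made up for illustration; the only assumption is that the second column may or may not start with whitespace.

```python
import re

# Keep any leading whitespace plus "mam"; drop "mal[<digits>]".
PATTERN = re.compile(r'(\s*mam)mal\[\d+\]')

def shorten(field):
    # r'\1' refers back to group 1; a non-raw '\1' would insert chr(1) instead.
    return PATTERN.sub(r'\1', field)

samples = [" mammal[1]", " fish[2]", "mammal[30]"]
results = [shorten(s) for s in samples]
print(results)  # [' mam', ' fish[2]', 'mam']
```

Fields without `mammal[...]` pass through unchanged, so the same call can be applied to every row.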


No need for regex here:

for row in csv.reader(fin):
    # Replace the whole field when it starts with "mammal["
    if row[1].startswith("mammal["):
        row[1] = "mam"
    writer.writerow(row)

Performance-wise this should be the better option, because:

  • no regex is involved
  • the string is replaced only when it matches, and left unchanged otherwise

Or, possibly faster still, with a generator expression and writerows:

with open('animals.csv', 'r') as fin, open("out.csv", 'w') as fout:
    csv.writer(fout).writerows([row[0], "mam"] if row[1].startswith("mammal[") else row for row in csv.reader(fin))
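
The speed claim is easy to check on your own data. The following is only a rough sketch: the function names, the synthetic field list and the repeat count are all assumptions made for illustration, and the timings will vary by machine and Python version.

```python
import re
import timeit

PATTERN = re.compile(r'(\s*mam)mal\[\d+\]')

def with_regex(field):
    # Regex approach: strip "mal[<digits>]" after "mam".
    return PATTERN.sub(r'\1', field)

def with_startswith(field):
    # Plain string approach, assuming one leading space in the field.
    return " mam" if field.startswith(" mammal[") else field

# Synthetic second-column values modelled on the sample file.
fields = [" mammal[1]", " fish[2]", " mammal[3]", " mammal[4]", " fish[5]"] * 1000

# Both approaches should agree before their speed is compared.
same = [with_regex(f) for f in fields] == [with_startswith(f) for f in fields]

for fn in (with_regex, with_startswith):
    seconds = timeit.timeit(lambda: [fn(f) for f in fields], number=20)
    print(f"{fn.__name__}: {seconds:.3f}s")
```

For a file this small, either version is fast enough; the difference only matters on large inputs.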

Note: there seems to be a leading space in the second column. In that case, add a space in front of the search and replacement strings.
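
Putting the pieces together with the leading space taken into account, a self-contained sketch could look like this. It reads from in-memory strings instead of animals.csv so it can be run as is; the function name `shorten_mammals` and the `lstrip()` check are illustrative choices, not part of the answers above.

```python
import csv
import io

def shorten_mammals(fin, fout):
    """Copy csv rows from fin to fout, turning ' mammal[n]' into ' mam'."""
    writer = csv.writer(fout)
    for row in csv.reader(fin):
        # The space after the comma is part of the second field,
        # so ignore it for the test and put it back in the replacement.
        if len(row) > 1 and row[1].lstrip().startswith("mammal["):
            row[1] = " mam"
        writer.writerow(row)

sample = (
    "cat, mammal[1]\n"
    "shark, fish[2]\n"
    "dog, mammal[3]\n"
    "tiger, mammal[4]\n"
    "salmon, fish[5]\n"
)

out = io.StringIO()
shorten_mammals(io.StringIO(sample), out)
lines = out.getvalue().splitlines()
print("\n".join(lines))
```

With real files, pass the two handles from `open('animals.csv', newline='')` and `open('out.csv', 'w', newline='')`; the csv module documentation recommends `newline=''` for files it reads or writes.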
