python: replace words in file with words from other file
I have a large text file containing words I want to replace. I keep those words in a CSV file, because I'm constantly adding and changing them and don't want to hard-code them in the Python script itself. Each line holds a word I want to replace, followed by the word I want to replace it with, like this:
A_old,A_new
another word,another new word
something old,something new
hello,bye
I know how to replace single words in a file with Python's string replace function, but I don't know how to do this when the words are listed in a different file. I tried my best, but I can't wrap my head around how to work with dictionaries/lists/tuples. I am rather new to Python, and until now I have managed with examples from around the internet, but this is beyond my capabilities. I got all kinds of errors like 'unhashable type: list' and 'expected a character buffer object'. The last thing I tried was the most successful in that I didn't get any errors, but then nothing happened either. This is the code. I'm sure it's ugly, but I hope it's not entirely hopeless.
reader = csv.reader(open('words.csv', 'r'))
d = {}
for row in reader:
    key, value = row
    d[key] = value
newwords = str(d.keys())
oldwords = str(d.values())
with open('new.txt', 'wt') as outfile:
    with open('old.txt', 'rt') as infile:
        for line in infile:
            outfile.write(line.replace(oldwords,newwords))
The reason I am doing this is that I'm working on a cookbook with an ingredient-based index, and I don't want an index with both 'carrot' and 'carrots'; instead I want to change 'carrot' into 'carrots', and so on for all the other ingredients. Thanks a bunch for a nudge in the right direction!
First, build a list of (old_word, new_word) pairs from 'words.csv':
old_new = [i.strip().split(',') for i in open('words.csv')]
Then you can replace line by line:
with open('new.txt', 'w') as outfile, open('old.txt') as infile:
    for line in infile:
        for oldword, newword in old_new:
            line = line.replace(oldword, newword)
        outfile.write(line)
or in the whole file at once:
with open('new.txt', 'w') as outfile, open('old.txt') as infile:
    txt = infile.read()
    for oldword, newword in old_new:
        txt = txt.replace(oldword, newword)
    outfile.write(txt)
Either way, the words still have to be replaced one pair at a time.
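One caveat with chained `str.replace` calls, offered as a sketch and not as part of the answer above: plain substring replacement also hits words that already contain the old word, so mapping 'carrot' to 'carrots' turns an existing 'carrots' into 'carrotss'. Assuming whole-word matching is what is wanted, a single `re.sub` pass over an alternation of the old words avoids that. The function name `replace_words` is hypothetical.

```python
import re

def replace_words(text, pairs):
    """Replace whole words in text using (old, new) pairs in a single pass."""
    mapping = dict(pairs)
    if not mapping:
        return text
    # Longest keys first, so a multi-word key wins over a shorter key it contains.
    keys = sorted(mapping, key=len, reverse=True)
    # \b anchors each match at word boundaries; re.escape protects punctuation.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')
    # Each match is looked up once, so replaced text is never replaced again.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

sample = replace_words('1 carrot, 2 carrots, hello',
                       [('carrot', 'carrots'), ('hello', 'bye')])
```

Note that `\b` only behaves as expected when the old words start and end with letters, digits, or underscores; keys that begin or end with punctuation would need a different boundary test.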
In your code example you read the replacement word pairs into a dictionary, and then into two lists with keys and values. I'm not sure why.
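A likely reason the original attempt ran without errors yet changed nothing, shown as a small sketch with made-up data: calling `str()` on the keys turns the whole collection into one string (its printed representation), and `line.replace` then searches each line for that entire string, which never occurs. The variable names in the question are also swapped: the dictionary keys are the old words, not the new ones.

```python
d = {'carrot': 'carrots', 'hello': 'bye'}

# str() of the keys yields the printed form of the collection, not a word.
oldwords = str(list(d.keys()))

line = 'hello carrot'
# The bracketed string is not a substring of the line, so nothing is replaced.
unchanged = line.replace(oldwords, str(list(d.values())))

# Looping over the pairs replaces each word in turn.
fixed = line
for old, new in d.items():
    fixed = fixed.replace(old, new)
```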
I propose to read the replacement words into a list of tuples.
# Text mode ('r') so each line is a str that can be split on ','.
with open('words.csv', 'r') as rep_words:
    rep_list = []
    for rep_line in rep_words:
        rep_list.append(tuple(rep_line.strip().split(',')))
Then you can open the old.txt and new.txt files and perform the replacement using a nested for loop:
with open('old.txt', 'r') as old_text:
    with open('new.txt', 'w') as new_text:
        for read_line in old_text:
            new_line = read_line
            # Apply every (old, new) pair to the current line.
            for old_word, new_word in rep_list:
                new_line = new_line.replace(old_word, new_word)
            new_text.write(new_line)
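As a variation on the same idea, here is a minimal sketch that reads the pairs with the standard `csv` module instead of splitting on commas by hand. This assumes Python 3, and it means blank lines are skipped and a quoted field such as "a, b" may itself contain a comma. The helper names `load_pairs` and `replace_in_file` are hypothetical.

```python
import csv

def load_pairs(csv_path):
    """Return a list of (old, new) tuples, skipping blank or malformed rows."""
    pairs = []
    # newline='' is what the csv documentation recommends for csv.reader input.
    with open(csv_path, newline='') as f:
        for row in csv.reader(f):
            if len(row) == 2:
                pairs.append((row[0], row[1]))
    return pairs

def replace_in_file(src_path, dst_path, pairs):
    """Copy src_path to dst_path, applying each replacement to every line."""
    with open(src_path) as src, open(dst_path, 'w') as dst:
        for line in src:
            for old, new in pairs:
                line = line.replace(old, new)
            dst.write(line)
```

With the file names from the question, usage would look like `replace_in_file('old.txt', 'new.txt', load_pairs('words.csv'))`.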