Tag all English words in multiple text files in the same directory
I am trying to modify the code to apply to multiple text files in the same directory. The code looks as follows, but there is an error: "NameError: name 'output' is not defined". Can you help me by suggesting improvements to the code?
    import re

    def replaceenglishwords(filename):
        mark_pattern = re.compile("\\*CHI:.*")
        word_pattern = re.compile("([A-Za-z]+)")
        for line in filename:
            # Split into possible words
            parts = line.split()
            if mark_pattern.match(parts[0]) is None:
                output.write()
                continue
            # Got a CHI line
            new_line = line
            for word in parts[1:]:
                matches = word_pattern.match(word)
                if matches:
                    old = f"\\b{word}\\b"
                    new = f"{matches.group(1)}@s:eng"
                    new_line = re.sub(old, new, new_line, count=1)
            output.write(new_line)

    import glob
    for file in glob.glob('*.txt'):
        outfile = open(file.replace('.txt', '-out.txt'), 'w', encoding='utf8')
        for line in open(file, encoding='utf8'):
            print(replaceenglishwords(line), '\n', end='', file=outfile)
        outfile.close()
replaceenglishwords needs two parameters, one for the file you are searching and one for the file where you write your results: replaceenglishwords(filename, output). It looks like your function already reads the input file line by line by itself, so the loop should pass it the open file rather than each individual line.
Now you can open both files in your loop and pass them to replaceenglishwords:
    for file in glob.glob('*.txt'):
        textfile = open(file, encoding='utf8')
        outfile = open(file.replace('.txt', '-out.txt'), 'w', encoding='utf8')
        replaceenglishwords(textfile, outfile)
        textfile.close()
        outfile.close()
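For completeness, here is a rough sketch of how the whole script might look once the function takes both files. It is not the original code: the per-word loop is swapped for a single re.sub that tags the leading run of ASCII letters in each whitespace-separated token, non-CHI and blank lines are copied through unchanged, and tag_directory is a hypothetical helper name. Skipping files that already end in -out is also an assumption, added so that a second run does not re-tag earlier results.

```python
import glob
import os
import re

# Leading run of ASCII letters in each whitespace-delimited token.
# The lookbehind keeps "CHI" inside the "*CHI:" marker from matching,
# because "C" is preceded by the non-space character "*".
ENGLISH_WORD = re.compile(r"(?<!\S)([A-Za-z]+)")


def replaceenglishwords(textfile, output):
    """Copy textfile to output, tagging English words on *CHI: lines."""
    for line in textfile:
        if line.lstrip().startswith("*CHI:"):
            line = ENGLISH_WORD.sub(r"\1@s:eng", line)
        # Every line is written, so other lines pass through untouched.
        output.write(line)


def tag_directory(pattern="*.txt"):
    """Hypothetical helper: write NAME-out.txt next to each matching file."""
    for path in glob.glob(pattern):
        root, ext = os.path.splitext(path)
        if root.endswith("-out"):
            # Assumption: treat these as results of an earlier run.
            continue
        with open(path, encoding="utf8") as textfile, \
                open(root + "-out" + ext, "w", encoding="utf8") as outfile:
            replaceenglishwords(textfile, outfile)
```

Calling tag_directory() from the directory that holds the transcripts should then produce one -out.txt file per input. The with statement closes both files even if an exception is raised partway through.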