Write to file without redundant lines in Python
I'm writing a Python script that reads lines from an input file and writes each unique line to an output file, that is, only if the same line is not already in the output file. Somehow, my script always appends the first line of the input file to the output file, even if that line is already there. I can't figure out why this happens. Does anyone know why, and how I can fix it? Thanks.
import os
input_file = 'input.txt'
output_file = 'output.txt'
fo = open(output_file, 'a+')
flag = False
with open(input_file, 'r') as fi:
    for line1 in fi:
        print line1
        for line2 in fo:
            print line2
            if line2 == line1:
                flag = True
                print('Found Match!!')
                break
        if flag == False:
            fo.write(line1)
        elif flag == True:
            flag == False
        fo.seek(0)
fo.close()
fi.close()
When you open a file in append mode, the file object's position is at the end of the file. So the first time through, when execution reaches for line2 in fo:, there are no more lines to read from fo, so that block is skipped and flag is still False, so the first line is written to the output file. After that, you call fo.seek(0), so each subsequent line is checked against the entire file.
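The starting position described above can be checked directly. The following is a minimal sketch, assuming Python 3; the helper name first_read_lengths and the throwaway file are hypothetical and exist only for this illustration.

```python
import os
import tempfile


def first_read_lengths(path):
    """Return (chars read right after opening in 'a+', chars read after seek(0))."""
    with open(path, 'a+') as fo:
        before = len(fo.read())  # position starts at the end, so nothing is read
        fo.seek(0)               # rewind to the start of the file
        after = len(fo.read())   # now the whole file is read
    return before, after


# Demo on a throwaway file holding two lines (13 characters in total).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'output.txt')
    with open(demo_path, 'w') as f:
        f.write('first\nsecond\n')
    print(first_read_lengths(demo_path))  # (0, 13)
```

The first read returns nothing, which is exactly why the inner loop in the question is skipped for the first input line.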
The answer by kmacinnis is right about why your code isn't working: you need to use mode 'r+' instead of 'a+', or else put fo.seek(0) at the beginning of the for loop instead of the end.
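The second option, rewinding at the start of each pass, might look like the sketch below. It assumes Python 3 and input lines that end with a newline; the function name append_unique_lines is hypothetical, and the flag variable is replaced by any() purely for brevity.

```python
def append_unique_lines(input_path, output_path):
    """Append each line of input_path to output_path unless it is already there."""
    with open(input_path, 'r') as fi, open(output_path, 'a+') as fo:
        for line1 in fi:
            fo.seek(0)  # rewind before every scan, including the very first one
            found = any(line2 == line1 for line2 in fo)
            if not found:
                fo.write(line1)  # in append mode, writes always go to the end
```

This keeps the original nested-scan design, so it still rereads the output file once per input line.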
That said, there's a much better way to do this than rereading the entire output file for every line of the input file: read the output file once into a set, then test each input line for membership.
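def ensure_file_ends_with_newline(handle):
    # Append a newline if the existing content does not already end with one.
    # Reading the content avoids handle.seek(-1, 2), which Python 3 rejects on text-mode files.
    position = handle.tell()
    handle.seek(0)
    content = handle.read()
    if content and not content.endswith('\n'):
        handle.write('\n')
    handle.seek(position)

input_filepath = 'input.txt'
output_filepath = 'output.txt'
with open(input_filepath, 'r') as infile, open(output_filepath, 'r+') as outfile:
    ensure_file_ends_with_newline(outfile)
    written = set(outfile)  # lines already present in the output file
    for line in infile:
        if line not in written:
            outfile.write(line)
            written.add(line)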
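One case the set-based approach can still miss is a final input line with no trailing newline, which will not compare equal to the same text stored with one. A possible variant, sketched here under the assumption of Python 3, compares lines with the newline stripped. The function name append_missing_lines is hypothetical, and it opens the output in 'a+' rather than 'r+' so that a missing output file is created.

```python
def append_missing_lines(input_path, output_path):
    """Append lines of input_path that output_path lacks; return how many were added."""
    added = 0
    with open(output_path, 'a+') as outfile:
        outfile.seek(0)  # 'a+' starts at the end, so rewind before reading
        existing_lines = outfile.readlines()
        seen = {existing.rstrip('\n') for existing in existing_lines}
        # Make sure new lines do not get glued onto an unterminated last line.
        if existing_lines and not existing_lines[-1].endswith('\n'):
            outfile.write('\n')
        with open(input_path, 'r') as infile:
            for line in infile:
                text = line.rstrip('\n')  # compare without the line terminator
                if text not in seen:
                    outfile.write(text + '\n')
                    seen.add(text)
                    added += 1
    return added
```

Because the output file is read once and membership tests on a set take roughly constant time, the work grows with the combined size of the two files rather than with their product.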
Your flag was never reset to False. flag == False is an equality comparison, and its result is simply discarded; flag = False is an assignment. Use the latter.
import os
input_file = 'input.txt'
output_file = 'output.txt'
fo = open(output_file, 'a+')
flag = False
with open(input_file, 'r') as fi:
    for line1 in fi:
        #print line1
        for line2 in fo:
            #print line2
            if line2 == line1:
                flag = True
                print('Found Match!!')
                print (line1,line2)
                break
        if flag == False:
            fo.write(line1)
        elif flag == True:
            flag = False
        fo.seek(0)
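The difference between the comparison and the assignment can be seen in isolation. This is only an illustrative sketch; the names after_comparison and after_assignment are made up for the demonstration.

```python
flag = True

flag == False            # comparison: evaluates to False, and the result is thrown away
after_comparison = flag  # flag is unchanged, so this is still True

flag = False             # assignment: rebinds flag
after_assignment = flag  # now False

print(after_comparison, after_assignment)  # True False
```

In the question's script this means that once a match sets flag to True, it stays True, and no later line is ever written.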