Write to file without redundant lines in Python

I'm writing a Python script that reads lines from an input file and writes each unique line to an output file, skipping any line the output file already contains. However, my script always appends the first line of the input file to the output file, even when that line is already there. I can't figure out why this happens. Does anyone know why, and how I can fix it? Thanks.

import  os

input_file= 'input.txt'
output_file = 'output.txt'

fo = open(output_file, 'a+')
flag = False
with open(input_file, 'r') as fi:
    for line1 in fi:
       print line1
       for line2 in fo:
           print line2
           if line2 == line1:
               flag = True
               print('Found Match!!')
               break
       if flag == False:
           fo.write(line1)
       elif flag == True:
           flag == False
       fo.seek(0)
    fo.close()
    fi.close()

When you open a file in append mode, the file position starts at the end of the file. So the first time through, when execution reaches for line2 in fo:, there are no more lines to read from fo, so that block is skipped and flag is still False, so the first line is written to the output file. After that, you call fo.seek(0), so subsequent lines are checked against the entire file.
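The starting position can be observed directly. The following is a minimal sketch, assuming Python 3 (where 'a+' positions the file at its end on open); the scratch file and its contents are made up for the demonstration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'output.txt')
with open(path, 'w') as f:
    f.write('alpha\nbeta\n')

with open(path, 'a+') as fo:
    # The position starts at the end of the file, so there is nothing to read.
    before_seek = [line for line in fo]
    # Rewinding makes the existing lines visible to the next read.
    fo.seek(0)
    after_seek = [line for line in fo]

print(before_seek)
print(after_seek)
```

The first list comes back empty even though the file has two lines, which is the same situation the inner loop is in on its first pass.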

The answer by kmacinnis is right about why your code isn't working: you need to use mode 'r+' instead of 'a+', or else put fo.seek(0) at the beginning of the for loop instead of the end.

That said, there's a much better way to do this than rereading the entire output file for every line of the input file: read the output file's lines into a set once, then check each input line against that set.

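def ensure_file_ends_with_newline(handle):
    # Remember the current position so the caller's position is unchanged.
    position = handle.tell()

    # Text-mode files in Python 3 reject seek(-1, 2), so read the content and
    # inspect its last character instead. An empty file needs no newline.
    handle.seek(0)
    content = handle.read()
    if content and not content.endswith('\n'):
        handle.seek(0, 2)
        handle.write('\n')

    handle.seek(position)


input_filepath = 'input.txt'
output_filepath = 'output.txt'

# 'r+' requires that the output file already exists.
with open(input_filepath, 'r') as infile, open(output_filepath, 'r+') as outfile:
    ensure_file_ends_with_newline(outfile)

    # Lines already in the output file; reading them leaves the position at the end.
    written = set(outfile)

    for line in infile:
        if line not in written:
            outfile.write(line)
            written.add(line)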
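Two cases are not covered above: an output file that does not exist yet, and an input file whose last line has no trailing newline. One possible variant is sketched below. The function name append_unique_lines and the choice to compare lines without their trailing newline are assumptions made for illustration, not part of the answer above.

```python
import os


def append_unique_lines(input_path, output_path):
    """Append lines from input_path that output_path does not already contain.

    Hypothetical helper: lines are compared without their trailing newline.
    Returns the number of lines appended.
    """
    seen = set()
    needs_newline = False
    if os.path.exists(output_path):
        with open(output_path, 'r') as existing:
            for line in existing:
                seen.add(line.rstrip('\n'))
                # After the loop this reflects the last line only.
                needs_newline = not line.endswith('\n')

    appended = 0
    # Mode 'a' creates the file if it is missing and always writes at the end.
    with open(input_path, 'r') as infile, open(output_path, 'a') as outfile:
        if needs_newline:
            outfile.write('\n')  # terminate a dangling last line first
        for line in infile:
            text = line.rstrip('\n')
            if text not in seen:
                outfile.write(text + '\n')
                seen.add(text)
                appended += 1
    return appended
```

Calling append_unique_lines('input.txt', 'output.txt') twice in a row should append nothing the second time, since every input line is already present by then.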

Your flag was never reset to False.

flag == False is an equality comparison; its result is discarded.

flag = False is an assignment.

Use the latter.

input_file = 'input.txt'
output_file = 'output.txt'

flag = False
with open(input_file, 'r') as fi, open(output_file, 'a+') as fo:
    for line1 in fi:
        # 'a+' starts at the end of the file, so rewind before every scan.
        fo.seek(0)
        for line2 in fo:
            if line2 == line1:
                flag = True
                print('Found Match!!')
                print(line1, line2)
                break
        if not flag:
            # In append mode, writes always go to the end of the file.
            fo.write(line1)
        else:
            flag = False  # assignment (=), not the comparison flag == False
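The difference between the two operators can be seen in isolation. This is a small sketch; the names after_comparison and after_assignment exist only for the illustration.

```python
flag = True

flag == False          # comparison: evaluates to False, and the result is thrown away
after_comparison = flag

flag = False           # assignment: rebinds the name flag
after_assignment = flag

print(after_comparison, after_assignment)
```

Only the assignment changes flag. With the comparison, flag stays True after the first match, so no later line is ever written.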
