Python : Compare two files

I have two input files. The first:

scandinavian t airline airline
one n 0 flightnumber
six n 0 flightnumber
two n 0 flightnumber
three n 0 flightnumber

speedbird t airline airline
one n 0 flightnumber
six n 0 flightnumber
eight n 0 flightnumber

My second input file:

scandinavian t airline airli
one n 0 flightnumber
six n 0 flightnumber
two n 0 flightnumber
three n 0 flightnumber

six n 0 flightnumber
eight n 0 flightnumber

I have the following code:

with open('output_ref.txt', 'r') as file1:
    with open('output_ref1.txt', 'r') as file2:
        same = set(file1).difference(file2)
        print(same)
        print("\n")

same.discard('\n')

with open('some_output_file.txt', 'w') as FO:
    for line in same:
        FO.write(line)

And I am getting this output:

scandinavian t airline airline
speedbird t airline airline

But the expected output is:

scandinavian t airline airline
speedbird t airline airline
one n 0 flightnumber

Can someone help me solve this issue?

First of all, if what you are trying to do is get the lines common to the two files (which the variable name "same" suggests), then you should use the intersection method instead of difference. Also, although both methods accept any iterable as their argument, I would go the extra step and turn the second file into a set too, so the intent is explicit. So the new code should be:

first = set(file1)
second = set(file2)
same = first.intersection(second)

.....
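
As a rough illustration of what the two set methods return, here is a minimal sketch using shortened in-memory stand-ins for the two files. The list contents and the names only_in_first and common are made up for this example and are not taken from the question's code. Because a set keeps only one copy of each line, the repeated "one n 0 flightnumber" line cannot show up in the difference:

```python
# Shortened, hypothetical stand-ins for the lines of the two input files.
first = [
    "scandinavian t airline airline\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
    "speedbird t airline airline\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
]
second = [
    "scandinavian t airline airli\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
    "six n 0 flightnumber\n",
]

# Converting to sets collapses duplicate lines into a single entry.
first_set = set(first)
second_set = set(second)

# Lines that occur in the first file but nowhere in the second.
only_in_first = first_set.difference(second_set)
# Lines that occur in both files.
common = first_set.intersection(second_set)

print(sorted(only_in_first))
# ['scandinavian t airline airline\n', 'speedbird t airline airline\n']
print(sorted(common))
# ['one n 0 flightnumber\n', 'six n 0 flightnumber\n']
```

The second "one n 0 flightnumber" line is lost as soon as the lists become sets, which matches the output reported in the question.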

EDIT:

Reading some of the comments on my post convinced me that you actually want the difference, computed on lists rather than sets, so that repeated lines are counted. I guess this should work for you:

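first = list(file1)
second = list(file2)
for line in second:
    try:
        # Remove one occurrence from first for each line of the second file.
        first.remove(line)
    except ValueError as e:
        print(e)  # alternatively, you could just pass here
# first now holds the lines of file1 that have no match in file2

def diff(a, b):
    # Return the items of a that have no one-for-one match in b.
    # Note: matched items are removed from b as the loop goes.
    y = []
    for x in a:
        if x not in b:
            y.append(x)
        else:
            b.remove(x)
    return y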

with open('output_ref.txt', 'r') as file1:
    with open('output_ref1.txt', 'r') as file2:
        same = diff(list(file1), list(file2))
        print(same)
        print("\n")

if '\n' in same:
    same.remove('\n')

with open('some_output_file.txt', 'w') as FO:
    for line in same:
        FO.write(line)

$ python compare.py
['scandinavian t airline airline\n', 'speedbird t airline airline\n', 'one n 0 flightnumber\n']

$ cat some_output_file.txt 
scandinavian t airline airline
speedbird t airline airline
one n 0 flightnumber
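
The list-based difference above cancels one line of the second file for each matching line of the first. The same idea can be sketched with collections.Counter from the standard library, which avoids the repeated list scans and leaves the second list unmodified. This is only a sketch: the function name multiset_diff and the sample lists are made up for illustration and do not come from the original posts.

```python
from collections import Counter


def multiset_diff(first_lines, second_lines):
    """Return the lines of first_lines with no one-for-one match in second_lines.

    The order of first_lines is preserved and second_lines is not modified.
    """
    # How many unmatched copies of each line the second file still has.
    remaining = Counter(second_lines)
    result = []
    for line in first_lines:
        if remaining[line] > 0:
            # Cancel this line against one copy in the second file.
            remaining[line] -= 1
        else:
            result.append(line)
    return result


# Shortened, hypothetical stand-ins for the lines of the two input files.
first = [
    "scandinavian t airline airline\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
    "speedbird t airline airline\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
]
second = [
    "scandinavian t airline airli\n",
    "one n 0 flightnumber\n",
    "six n 0 flightnumber\n",
    "six n 0 flightnumber\n",
]

print(multiset_diff(first, second))
# ['scandinavian t airline airline\n', 'speedbird t airline airline\n', 'one n 0 flightnumber\n']
```

With real files, the two lists could be built with list(file1) and list(file2) as in the code above.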
