How to write differences in files to new file in Python?

As part of a larger project I need to create output files from the matching and differing lines of two input files. Code sample is below:

with open('TestFile1.csv', 'r') as file_1:
    with open('TestFile2.csv', 'r') as file_2:
        same = set(file_1).intersection(file_2)
        different = set(file_1).difference(file_2)

same.discard('\n')

with open('output_file_same.txt', 'w') as file_out_1:
    for line in same:
        file_out_1.write(line)

with open('output_file_different.txt', 'w') as file_out_2:
    for line in different:
        file_out_2.write(line)

The code that compares the files and writes the matching lines works well, but the code that should write the differing lines produces a blank file instead. Any suggestions?

file_1 and file_2 are file objects, which means they're iterators; an iterator can be iterated exactly once, after which further attempts to iterate it yield nothing. So when you do:

same = set(file_1).intersection(file_2)

it empties both file_1 and file_2, so:

different = set(file_1).difference(file_2)

behaves roughly the same as set([]).difference([]). To fix this, make sure to slurp the data once up front, then reuse it, e.g.:

with open('TestFile1.csv', 'r') as file_1, open('TestFile2.csv', 'r') as file_2:
    file_1 = set(file_1)  # slurp to reusable set
    file_2 = set(file_2)  # slurp to reusable set
# Can be done outside the with block, since files no longer needed
same = file_1.intersection(file_2)     # Or: same = file_1 & file_2
different = file_1.difference(file_2)  # Or: different = file_1 - file_2
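
The exhaustion problem and the fix can be reproduced without touching the disk. The sketch below is only an illustration: io.StringIO objects stand in for the two open files, and their contents are made up. It also shows symmetric difference (^), in case "different" is meant to cover lines unique to either file rather than only those unique to the first one.

```python
import io

# io.StringIO stands in for the two open files (made-up contents).
file_1 = io.StringIO("a\nb\nc\n")
file_2 = io.StringIO("b\nc\nd\n")

same = set(file_1).intersection(file_2)     # consumes both iterators
different = set(file_1).difference(file_2)  # both are already exhausted

print(sorted(same))  # ['b\n', 'c\n']
print(different)     # set()

# Reading each source into a set once makes the data reusable.
lines_1 = set(io.StringIO("a\nb\nc\n"))
lines_2 = set(io.StringIO("b\nc\nd\n"))

print(sorted(lines_1 & lines_2))  # ['b\n', 'c\n']
print(sorted(lines_1 - lines_2))  # ['a\n']        (only in the first)
print(sorted(lines_1 ^ lines_2))  # ['a\n', 'd\n'] (in exactly one)
```

Note that sets drop duplicate lines and do not preserve the original line order.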

Side-note: You don't need explicit loops to write out the results:

for line in same:
    file_out_1.write(line)

can be simplified to:

file_out_1.writelines(same)

which is simpler and usually runs faster as well.
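
Putting the pieces together, a minimal end-to-end sketch might look like the following. The helper name split_lines and the sample file contents are hypothetical, chosen only for illustration; the results are sorted before writing because sets have no defined order, which is an extra assumption rather than something the question asks for.

```python
import os
import tempfile


def split_lines(path_1, path_2, same_path, different_path):
    """Write lines common to both files, and lines only in path_1."""
    with open(path_1) as f1, open(path_2) as f2:
        lines_1 = set(f1)  # read each file exactly once
        lines_2 = set(f2)

    same = lines_1 & lines_2
    same.discard('\n')  # ignore blank lines, as in the question
    different = lines_1 - lines_2

    with open(same_path, 'w') as out_same:
        out_same.writelines(sorted(same))
    with open(different_path, 'w') as out_diff:
        out_diff.writelines(sorted(different))
    return same, different


# Demo on throwaway files with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    p1 = os.path.join(tmp, 'TestFile1.csv')
    p2 = os.path.join(tmp, 'TestFile2.csv')
    with open(p1, 'w') as f:
        f.write("a,1\nb,2\n\n")
    with open(p2, 'w') as f:
        f.write("b,2\nc,3\n\n")

    same, different = split_lines(
        p1, p2,
        os.path.join(tmp, 'output_file_same.txt'),
        os.path.join(tmp, 'output_file_different.txt'),
    )
    print(sorted(same))       # ['b,2\n']
    print(sorted(different))  # ['a,1\n']
```

One caveat: lines are compared including their trailing newline, so a final line without one will not match an otherwise identical line that has one.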
