As part of a larger project, I need to create output files containing the lines that two input files share and the lines that differ between them. A code sample is below:
with open('TestFile1.csv', 'r') as file_1:
    with open('TestFile2.csv', 'r') as file_2:
        same = set(file_1).intersection(file_2)
        different = set(file_1).difference(file_2)
same.discard('\n')
with open('output_file_same.txt', 'w') as file_out_1:
    for line in same:
        file_out_1.write(line)
with open('output_file_different.txt', 'w') as file_out_2:
    for line in different:
        file_out_2.write(line)
Comparing the files and writing the matching lines to a file works well, but the code that should write the differing lines produces a blank file instead. Any suggestions?
file_1 and file_2 are file objects, which means they're iterators; an iterator can be iterated exactly once, after which attempts to iterate it again read nothing. So when you do:
same = set(file_1).intersection(file_2)
it empties both file_1 and file_2, so:
different = set(file_1).difference(file_2)
behaves roughly the same as set([]).difference([]). To fix, make sure to slurp the data once up front, then reuse it, e.g.:
with open('TestFile1.csv', 'r') as file_1, open('TestFile2.csv', 'r') as file_2:
    file_1 = set(file_1)  # slurp to reusable set
    file_2 = set(file_2)  # slurp to reusable set
# Can be done outside the with block, since files no longer needed
same = file_1.intersection(file_2)  # Or: same = file_1 & file_2
different = file_1.difference(file_2)  # Or: different = file_1 - file_2
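To see the exhaustion without touching real files, here is a minimal sketch that uses io.StringIO objects as stand-ins for the two CSV files. The sample contents and the names lines_1, lines_2, fixed_same, and fixed_different are illustrative assumptions, not part of the original code.

```python
import io

# Hypothetical in-memory stand-ins for TestFile1.csv and TestFile2.csv.
file_1 = io.StringIO("a\nb\nc\n")
file_2 = io.StringIO("b\nc\nd\n")

# The question's approach: the first statement consumes both iterators.
same = set(file_1).intersection(file_2)
# Both objects are now exhausted, so this compares two empty inputs.
different = set(file_1).difference(file_2)

print(sorted(same))       # ['b\n', 'c\n']
print(sorted(different))  # []

# The fix: rewind, read each source into a set once, then reuse the sets.
file_1.seek(0)
file_2.seek(0)
lines_1 = set(file_1)
lines_2 = set(file_2)

fixed_same = lines_1 & lines_2
fixed_different = lines_1 - lines_2
print(sorted(fixed_different))  # ['a\n']
```

The seek(0) calls are only needed here because the sketch reuses the same objects after exhausting them; in the fixed code above, each file is read exactly once.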
Side-note: You don't need explicit loops to write out the results;
for line in same:
    file_out_1.write(line)
can be simplified to:
file_out_1.writelines(same)
which runs faster, as well as being simpler.
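Putting both suggestions together, a rough end-to-end sketch might look like the following. The function name write_same_and_different, the temporary directory, and the sample contents are hypothetical, chosen only so the example runs on its own. The sorted() calls are also an addition: sets are unordered, so sorting just makes the output order predictable.

```python
import os
import tempfile


def write_same_and_different(path_1, path_2, out_same, out_different):
    """Write lines common to both files, and lines found only in the first."""
    with open(path_1, 'r') as file_1, open(path_2, 'r') as file_2:
        lines_1 = set(file_1)  # read each file exactly once
        lines_2 = set(file_2)

    same = lines_1 & lines_2
    different = lines_1 - lines_2
    same.discard('\n')  # drop blank lines, as in the question

    # writelines accepts any iterable of strings; no explicit loop needed.
    with open(out_same, 'w') as file_out_1:
        file_out_1.writelines(sorted(same))
    with open(out_different, 'w') as file_out_2:
        file_out_2.writelines(sorted(different))
    return same, different


# Usage with throwaway files (hypothetical sample data).
with tempfile.TemporaryDirectory() as tmp:
    path_1 = os.path.join(tmp, 'TestFile1.csv')
    path_2 = os.path.join(tmp, 'TestFile2.csv')
    out_same = os.path.join(tmp, 'output_file_same.txt')
    out_different = os.path.join(tmp, 'output_file_different.txt')

    with open(path_1, 'w') as f:
        f.write("a\nb\nc\n")
    with open(path_2, 'w') as f:
        f.write("b\nc\nd\n")

    same, different = write_same_and_different(
        path_1, path_2, out_same, out_different)

    with open(out_same) as f:
        same_text = f.read()
    with open(out_different) as f:
        different_text = f.read()

print(repr(same_text))       # 'b\nc\n'
print(repr(different_text))  # 'a\n'
```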