I'm trying to extend the code below (thanks to Carles Mitjans) so that, when a key in file2 matches a key in file1 (e.g. 12345 appears in both), it prints the full line from file1, including the path, for every match between the two files.
However, I can't work out how to change the intersection to allow this.
file1 example:
12345 /users/test/Desktop
54321 /users/test/Downloads
0000 /users/test/Desktop

file2 example:
543252
12345
11111
0000
with open('Results.txt', 'r') as file1:
    with open('test.txt', 'r') as file2:
        a = set(x.split()[0] for x in file1)
        b = [x.rstrip() for x in file2]
        same = a.intersection(b)
        for line in same:
            print line
same.discard('\n')
It currently outputs 12345 0000
Any pointers or guidance would be gratefully received.
Use a dict to map each key (the first field of a line) to its full line. There is also no need to nest the two with blocks; the two files can be read one after the other.
a = set()
map = {}
with open('Results.txt', 'r') as file1:
    for x in file1:
        splits = x.split()
        a.add(splits[0])
        map[splits[0]] = x
b = []
with open('test.txt', 'r') as file2:
    b = [x.rstrip() for x in file2]
same = a.intersection(b)
same.discard('\n')
for keyItem in same:
    print map[keyItem]
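As the OP noticed, the solution above prints only the last matching line for each key, because each assignment to map[splits[0]] overwrites the previous one. Storing a list of lines per key solves the issue.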
map = dict()
with open('Results.txt', 'r') as file1:
    for x in file1:
        splits = x.split()
        a.add(splits[0])
        # If the key is not present, initialize it
        if splits[0] not in map:
            map[splits[0]] = []
        # Append the line to map value
        map[splits[0]].append(x)
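Everything else stays the same, except that map[keyItem] is now a list of lines, so the final loop should print each line in that list rather than the list itself.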
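
For reference, the pieces above can be combined into one short sketch. This is only an illustration under a few assumptions: the function name matching_lines and the variable names are made up for this example, collections.defaultdict replaces the manual "if key not in map" check, and print is written as a function call so the snippet runs on both Python 2 and Python 3. The sample data is taken from the question.

```python
from collections import defaultdict


def matching_lines(results_lines, key_lines):
    """Return every line of results_lines whose first field appears in key_lines.

    Both arguments are iterables of text lines (lists or open file objects).
    The name and signature are hypothetical, chosen for this sketch.
    """
    # Map each key (first whitespace-separated field) to ALL of its lines.
    lines_by_key = defaultdict(list)
    for line in results_lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        lines_by_key[fields[0]].append(line.rstrip("\n"))

    matches = []
    seen = set()  # report each key once, even if it is repeated in key_lines
    for key in key_lines:
        key = key.strip()
        if key in seen:
            continue
        seen.add(key)
        # .get avoids creating an empty entry for keys that are not present
        matches.extend(lines_by_key.get(key, []))
    return matches


# Sample data from the question
sample_file1 = [
    "12345 /users/test/Desktop\n",
    "54321 /users/test/Downloads\n",
    "0000 /users/test/Desktop\n",
]
sample_file2 = ["543252\n", "12345\n", "11111\n", "0000\n"]

for line in matching_lines(sample_file1, sample_file2):
    print(line)
```

With real files, the open file objects for Results.txt and test.txt can be passed in place of the two sample lists. Unlike the set-based version, this sketch returns matches in the order the keys appear in the second file.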