
Python File Matching and Line Printing

I'm trying to extend the code below (thanks to Carles Mitjans) so that when a value in file2 matches the first column of a line in file1 (e.g. 12345 appears in both), it prints the full line from file1, including the path, for every match between the two files.

However, I can't work out how to change the intersection into something that would allow this.

file1 example                    file2 example
12345 /users/test/Desktop        543252 
54321 /users/test/Downloads      12345  
0000  /users/test/Desktop        11111
                                 0000


with open('Results.txt', 'r') as file1:
    with open('test.txt', 'r') as file2:
        a = set(x.split()[0] for x in file1)
        b = [x.rstrip() for x in file2]
        same = a.intersection(b)
        for line in same:
            print line

same.discard('\n')

It currently outputs 12345 0000

Any pointers or guidance would be gratefully received.

Use a dict to map each matching term (the first column) to its full line. There is also no need to nest the two with blocks: the files can be read one after the other, which keeps the code flatter and closes each file as soon as it has been read.

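a = set()
map = {}
with open('Results.txt', 'r') as file1:
    for x in file1:
        splits = x.split()
        if not splits:
            continue  # skip blank lines, which have no first column
        a.add(splits[0])
        map[splits[0]] = x  # first column -> full line

with open('test.txt', 'r') as file2:
    b = [x.strip() for x in file2]

same = a.intersection(b)

for keyItem in same:
    # strip the trailing newline so print does not emit a blank line
    print(map[keyItem].rstrip())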

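As the OP noticed, the solution above prints only the last matching line for each key, because every assignment to map[splits[0]] overwrites the previous one. Storing a list of lines per key solves the issue.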

map = dict()
with open('Results.txt', 'r') as file1:
    for x in file1:
        splits = x.split()
        if not splits:
            continue  # skip blank lines, which have no first column
        a.add(splits[0])

        # If the key is not present, initialize it
        if splits[0] not in map:
            map[splits[0]] = []
        # Append the line to the list stored for this key
        map[splits[0]].append(x)

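Everything else stays the same, except the final loop: each value in map is now a list of lines, so loop over map[keyItem] and print each line instead of printing the list itself.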
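For reference, here is a self-contained Python 3 sketch of the same list-per-key idea. It swaps the manual "if key not in map" check for collections.defaultdict and returns the matches in the order the keys appear in file2. The function name matching_lines and the extra "12345 /users/test/Documents" sample line are made up for illustration and are not from the question; the sample data stands in for Results.txt and test.txt.

```python
from collections import defaultdict


def matching_lines(file1_lines, file2_lines):
    """Return every file1 line whose first column is listed in file2."""
    # Group the file1 lines by their first whitespace-separated column.
    lines_by_key = defaultdict(list)
    for line in file1_lines:
        fields = line.split()
        if fields:  # skip blank lines instead of raising IndexError
            lines_by_key[fields[0]].append(line.rstrip("\n"))

    # Walk file2 in order, emitting all lines stored for each matching key.
    matches = []
    seen = set()
    for raw in file2_lines:
        key = raw.strip()
        if key and key not in seen and key in lines_by_key:
            seen.add(key)  # a key repeated in file2 is reported only once
            matches.extend(lines_by_key[key])
    return matches


# In-memory stand-ins for the two files (hypothetical sample data).
file1 = [
    "12345 /users/test/Desktop\n",
    "54321 /users/test/Downloads\n",
    "0000  /users/test/Desktop\n",
    "12345 /users/test/Documents\n",  # added to show a repeated key
]
file2 = ["543252\n", "12345\n", "11111\n", "0000\n"]

for matched in matching_lines(file1, file2):
    print(matched)
```

Because open file objects iterate line by line, the same function should also accept them directly, e.g. by opening 'Results.txt' and 'test.txt' in two separate with blocks and passing the handles (or lists read from them) as the two arguments.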
