
Comparing the contents of two .txt files for deleted or changed lines in Python

I'm trying to compare two .txt files for changed or deleted lines. If a line was deleted, I want to output the deleted line, and if it was changed, I want to output the new line. I originally tried comparing line by line, but that didn't work for my purpose once something had been deleted:

for line1 in f1:
    for line2 in f2:
        if line1 == line2:
            print("SAME", file=x)
        else:
            print(f"Original:{line1} / New:{line2}", file=x)

Then I tried not comparing line to line so I could figure out if something was deleted but I'm not getting any output:

def check_diff(f1,f2):
    check = {}
    for file in [f1,f2]:
        with open(file,'r') as f:
            check[file] = []
            for line in f:
                check[file].append(line)
    diff = set(check[f1]) - set(check[f2])
    for line in diff:
        print(line.rstrip(),file=x)

I got this far by combining answers from a lot of earlier questions similar to mine, but I'm new to Python, so I need a little extra help. Thanks. Please let me know if I need to add any more information.

The concept is simple. Let's say file1.txt is the original file, and file2.txt is the one we want to check for changes:

with open('file1.txt', 'r') as f:
    f1 = f.readlines()
with open('file2.txt', 'r') as f:
    f2 = f.readlines()

deleted = []
added = []

for l in f1:
    if l not in f2:
        deleted.append(l)
for l in f2:
    if l not in f1:
        added.append(l)
        
print('Deleted lines:')
print(''.join(deleted))
print('Added lines:')
print(''.join(added))

For every line in the original file, if that line isn't in the other file, the line has been deleted. If it's the other way around, the line has been added.
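One caveat with a plain membership test is that it ignores how many times a line occurs, so removing one copy of a duplicated line goes unnoticed. If counts matter for your files, something like the sketch below might help. It uses collections.Counter from the standard library; the function name count_diff and the sample lines are made up for illustration.

```python
from collections import Counter


def count_diff(old_lines, new_lines):
    """Return (deleted, added) lines, taking duplicate lines into account.

    count_diff is a hypothetical helper name, not part of any library.
    """
    old_counts = Counter(old_lines)
    new_counts = Counter(new_lines)
    # Counter subtraction keeps only positive counts, so each result holds
    # the lines that occur more often on one side than on the other.
    deleted = list((old_counts - new_counts).elements())
    added = list((new_counts - old_counts).elements())
    return deleted, added


old = ["a", "b", "b", "c"]
new = ["a", "b", "c", "d"]
deleted, added = count_diff(old, new)
print("Deleted lines:", deleted)
print("Added lines:", added)
```

With the list-based check above, the second "b" would not be reported as deleted, because "b" still appears in the new file.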

I am not sure how you would quantify a changed line (since you could count it as one deleted line plus one added line), but perhaps something like the code below would be of some help. Note that if your files are large, it might be faster to store the data in a set instead of a list, since a membership test on a set is typically O(1) on average, while on a list it is O(n):

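# file1.txt is treated as the original file and file2.txt as the new version.
with open('file1.txt', 'r') as f1, open('file2.txt', 'r') as f2:
    file1 = set(f1.read().splitlines())
    file2 = set(f2.read().splitlines())

# Lines found only in the new file are new or changed lines.
changed_lines = [line for line in file2 if line not in file1]
# Lines found only in the original file were deleted (or are the old text of a changed line).
deleted_lines = [line for line in file1 if line not in file2]

# Sets are unordered, so the lines are not printed in file order.
print('Changed lines:\n' + '\n'.join(changed_lines))
print('Deleted lines:\n' + '\n'.join(deleted_lines))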
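If you do want to tell a changed line apart from a deleted one, one option is to compare the files by position rather than by membership. The sketch below is only one way to do that, using difflib.SequenceMatcher from the standard library. The function name classify_changes and the sample lines are assumptions for illustration, and it makes a simplifying choice: every "replace" block is reported as changed, even when the old and new blocks differ in length.

```python
import difflib


def classify_changes(old_lines, new_lines):
    """Return (changed, deleted, added) lines, based on their position.

    classify_changes is a hypothetical helper name. For changed lines the
    new text is returned, as asked for in the question.
    """
    changed, deleted, added = [], [], []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Old lines i1:i2 were replaced by new lines j1:j2.
            changed.extend(new_lines[j1:j2])
        elif tag == "delete":
            deleted.extend(old_lines[i1:i2])
        elif tag == "insert":
            added.extend(new_lines[j1:j2])
    return changed, deleted, added


old = ["alpha", "beta", "gamma", "delta"]
new = ["alpha", "BETA", "gamma"]
changed, deleted, added = classify_changes(old, new)
print("Changed lines:", changed)
print("Deleted lines:", deleted)
print("Added lines:", added)
```

To use it on real files, read each file with f.read().splitlines() and pass the two lists in, with the original file first.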
