
Finding different lines between 2 HTML files

I want to find the differences between two .txt files that contain HTML code. I tried the difflib module, but delta stays 0 no matter what I try. I need to find how many lines differ out of the total number of lines of HTML.

thanks!

import difflib
count = 0
count2 = 0
delta = 0 
f = open('C\html1.txt', 'r')
f2 = open('C\html2.txt', 'r')
for i in f2:
    count2 += 1
for i in f:
    count += 1
diff = difflib.udiff = difflib.unified_diff(
            f.readlines(),
           f2.readlines(),
           fromfile='C\html1.txt',
            tofile='C\html2.txt',
       )
for line in diff:
    delta +=1
print delta

print count
per = (delta * 100) / count

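The problem, as @wondercricket pointed out, is that both file pointers are already at EOF. The two for loops that count the lines read each file to the end, so the later f.readlines() and f2.readlines() calls return empty lists and unified_diff has nothing to compare.

One way to solve this is to call f.seek(0) and f2.seek(0) before calculating the diff, which moves each file pointer back to the beginning of its file.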
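A minimal sketch of that fix, written for Python 3 (the question's code uses Python 2 print statements). The helper name count_changed_lines and the in-memory io.StringIO objects standing in for the two files are assumptions made for illustration, not part of the original code:

```python
import difflib
import io
from itertools import islice


def count_changed_lines(from_lines, to_lines):
    """Count the '+' and '-' lines that difflib.unified_diff reports."""
    diff = difflib.unified_diff(from_lines, to_lines)
    # The first two lines yielded are the '---' / '+++' file headers.
    # Skip them by position so content lines that themselves begin
    # with '-' or '+' are still counted correctly.
    body = islice(diff, 2, None)
    return sum(1 for line in body if line.startswith(('+', '-')))


# Hypothetical stand-ins for the two opened files.
f = io.StringIO("<p>a</p>\n<p>b</p>\n<p>c</p>\n")
f2 = io.StringIO("<p>a</p>\n<p>B</p>\n<p>c</p>\n")

count = sum(1 for _ in f)   # iterating leaves f at EOF
exhausted = f.readlines()   # nothing left to read: an empty list

f.seek(0)                   # rewind before reading again
delta = count_changed_lines(f.readlines(), f2.readlines())
per = delta * 100 / count
print(delta, count, per)
```

Note that a modified line shows up in a unified diff as one removal plus one addition, so it contributes 2 to delta. An alternative that avoids seek(0) altogether is to call readlines() once per file, keep the resulting lists, and use len() on them for the line counts.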

