How to compare two files and print the mismatched line numbers in Python?
I have two files that contain the same number of lines.
"file1.txt" contains the following lines:
Attitude is a little thing that makes a big difference
The only disability in life is a bad attitude
Abundance is, in large part, an attitude
Smile when it hurts most
"file2.txt" contains:
Attitude is a little thing that makes a big difference
Everyone has his burden. What counts is how you carry it
Abundance is, in large part, an attitude
A positive attitude may not solve all your problems
I want to compare the two files line by line and, wherever a line does not match, print its line number:
print "mismatch in line no: 2"
print "mismatch in line no: 4" # in this case, lines 2 and 4 differ from the second file
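I tried the following, but I can only print the lines in file1 that differ from the lines in file2. I cannot print the line numbers of the mismatched lines.
My code:
with open("file1.txt") as f1:
    lineset = set(f1)
with open("file2.txt") as f2:
    lineset.difference_update(f2)
for line in lineset:
    print line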
import itertools
with open('file1.txt') as f1, open('file2.txt') as f2:
    for lineno, (line1, line2) in enumerate(itertools.izip(f1, f2), 1):
        if line1 != line2:
            print 'mismatch in line no:', lineno
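Another option, using zip and enumerate:
with open("file1.txt") as f1:
    with open("file2.txt") as f2:
        # start enumerate at 1 so the reported numbers match the line numbers in the files
        for idx, (lineA, lineB) in enumerate(zip(f1, f2), 1):
            if lineA != lineB:
                print 'mismatch in line no: {0}'.format(idx)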
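Or, if the files have different numbers of lines, you can try izip_longest:
import itertools
with open("file1.txt") as f1:
    with open("file2.txt") as f2:
        # izip_longest pads the shorter file with None, so extra lines are reported as mismatches
        for idx, (lineA, lineB) in enumerate(itertools.izip_longest(f1, f2), 1):
            if lineA != lineB:
                print 'mismatch in line no: {0}'.format(idx)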
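On Python 3, izip_longest is available as itertools.zip_longest. The following is a minimal sketch of the same idea, assuming Python 3; the helper name mismatched_line_numbers and the extra fifth line in the sample data are made up for illustration and are not part of the original answers.

```python
from itertools import zip_longest


def mismatched_line_numbers(lines1, lines2):
    """Return the 1-based positions at which two line sequences differ.

    Hypothetical helper for illustration. zip_longest pads the shorter
    input with None, so every extra line in the longer input is reported.
    """
    return [
        lineno
        for lineno, (line1, line2) in enumerate(zip_longest(lines1, lines2), 1)
        if line1 != line2
    ]


# Sample data from the question, plus one extra line in the first input.
first = [
    "Attitude is a little thing that makes a big difference\n",
    "The only disability in life is a bad attitude\n",
    "Abundance is, in large part, an attitude\n",
    "Smile when it hurts most\n",
    "An extra line that only the first input has\n",
]
second = [
    "Attitude is a little thing that makes a big difference\n",
    "Everyone has his burden. What counts is how you carry it\n",
    "Abundance is, in large part, an attitude\n",
    "A positive attitude may not solve all your problems\n",
]

for lineno in mismatched_line_numbers(first, second):
    print("mismatch in line no:", lineno)  # prints for lines 2, 4 and 5
```

Open file objects iterate over their lines, so the two objects returned by open('file1.txt') and open('file2.txt') can be passed to the helper directly in place of the lists.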
You may be able to use the difflib module. Here is a simple example that uses its difflib.Differ class:
import difflib
import sys
with open('file1.txt') as file1, open('file2.txt') as file2:
    line_formatter = '{:3d} {}'.format
    file1_lines = [line_formatter(i, line) for i, line in enumerate(file1, 1)]
    file2_lines = [line_formatter(i, line) for i, line in enumerate(file2, 1)]
    results = difflib.Differ().compare(file1_lines, file2_lines)
    sys.stdout.writelines(results)
Output:
    1 Attitude is a little thing that makes a big difference
-   2 The only disability in life is a bad attitude
+   2 Everyone has his burden. What counts is how you carry it
    3 Abundance is, in large part, an attitude
-   4 Smile when it hurts most
+   4 A positive attitude may not solve all your problems
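The minus and plus characters in the first column mark lines that were replaced, in the style of the typical diff utility. A line with no indicator is the same in both files. You could suppress those lines if you wish, but to keep the example simple, everything produced by the compare() method is printed.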
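Suppressing the unchanged lines can be done by filtering on the two-character prefix that Differ puts in front of every entry. The following is a minimal sketch, assuming Python 3; the helper name changed_entries is made up for illustration, and the sample lines are given without trailing newlines so they can be printed directly.

```python
import difflib


def changed_entries(lines1, lines2):
    """Return only the '-' and '+' entries from a Differ comparison.

    Hypothetical helper for illustration. Entries starting with two spaces
    (unchanged lines) and with '? ' (intraline hints) are dropped.
    """
    number = '{:3d} {}'.format
    numbered1 = [number(i, line) for i, line in enumerate(lines1, 1)]
    numbered2 = [number(i, line) for i, line in enumerate(lines2, 1)]
    diff = difflib.Differ().compare(numbered1, numbered2)
    return [entry for entry in diff if entry.startswith(('- ', '+ '))]


# Sample data from the question.
file1_lines = [
    "Attitude is a little thing that makes a big difference",
    "The only disability in life is a bad attitude",
    "Abundance is, in large part, an attitude",
    "Smile when it hurts most",
]
file2_lines = [
    "Attitude is a little thing that makes a big difference",
    "Everyone has his burden. What counts is how you carry it",
    "Abundance is, in large part, an attitude",
    "A positive attitude may not solve all your problems",
]

for entry in changed_entries(file1_lines, file2_lines):
    print(entry)
```

Because each line is prefixed with its own number before the comparison, every entry that survives the filter still shows which line of which file it came from.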
For reference, here are the contents of the two files side by side (file1.txt on the left, file2.txt on the right), with line numbers:
1 Attitude is a little thing that makes a big difference    Attitude is a little thing that makes a big difference
2 The only disability in life is a bad attitude             Everyone has his burden. What counts is how you carry it
3 Abundance is, in large part, an attitude                  Abundance is, in large part, an attitude
4 Smile when it hurts most                                  A positive attitude may not solve all your problems
# Python 3: the built-in zip is already lazy, so itertools.izip is not needed
with open('file1.txt') as f1, open('file2.txt') as f2:
    for lineno, (line1, line2) in enumerate(zip(f1, f2), 1):
        if line1 != line2:
            print('mismatch in line no:', lineno)