
How do I match two plain text files line by line using Python

I need to match two text files line by line in Python on Windows. For example, I have the following text files:

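File 1:

My name is xxx

Command completed successfully.

My mother's name is yyy

My mobile number is 12345

The heavy truck crashed into the building at midnight

Lorry eat in the college a red apple

File 2:

My name is xxx

Command . successfully.

My mother's name is

What a heavy truck in the building which crashed

Lorry eat an apple in the college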

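Sorry for not being clear enough. My question is how to align a movie script with its subtitles. I wrote the following Python code, but it is not enough to get an alignment between the two text files: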

# Open file for reading in text mode (default mode)
f1 = open('F:/CONTRIBUTION 2017/SCRIPT-SUBTITLES CODES/Script Alignement Papers/f1.txt','r')
f2 = open('F:/CONTRIBUTION 2017/SCRIPT-SUBTITLES CODES/Script Alignement Papers/f2.txt','r')

#Print confirmation
# print("-----------------------------------")
#print("Comparing files ", " > " + fname1, " < " +fname2, sep='\n')
# print("-----------------------------------")

# Read the first line from the files
f1_line = f1.readline()
f2_line = f2.readline()

# Initialize counter for line number
line_no = 1

# Loop if either file1 or file2 has not reached EOF
while f1_line != '' or f2_line != '':

    # Strip trailing whitespace (including the newline)
    f1_line = f1_line.rstrip()
    f2_line = f2_line.rstrip()

    # Compare the lines from both file
    if f1_line != f2_line:

        # If a line does not exist on file2 then mark the output with + sign
        if f2_line == '' and f1_line != '':
            print("=================================================================")
            print("=================================================================")
            print("line does not exist on File 2 ====================")
            print("=================================================================")
            print(">+", "Line-%d" % line_no, f1_line)
        # otherwise output the line on file1 and mark it with > sign
        elif f1_line != '':

            print("=================================================================")
            print("=================================================================")
            print("otherwise output the line on file1 ====================")
            print("=================================================================")
            print(">", "Line-%d" % line_no, f1_line)

        # If a line does not exist on file1 then mark the output with + sign
        if f1_line == '' and f2_line != '':
            print("=================================================================")
            print("=================================================================")
            print("=line does not exist on File 1 ====================")
            print("=================================================================")
            print("<+", "Line-%d" % line_no, f2_line)
        # otherwise output the line on file2 and mark it with < sign
        elif f2_line != '':
            print("=================================================================")
            print("=================================================================")
            print("otherwise output the line on file2 ====================")
            print("=================================================================")
            print("<", "Line-%d" %  line_no, f2_line)

        # Print a blank line
        print()

    #Read the next line from the file
    f1_line = f1.readline()
    f2_line = f2.readline()


    #Increment line counter
    line_no += 1

# Close the files
f1.close()
f2.close()

I would be grateful if anyone could help with this matching.

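It is best to post the code you have tried to write. This feels like we are doing your homework, and it makes you look lazy. That said, take a look at the following: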

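# Placeholder paths (not from the question); point these at your two files
file1 = 'f1.txt'
file2 = 'f2.txt'

with open(file1, 'r') as f1, open(file2, 'r') as f2:
    # readlines() returns a list of lines, so this compares the files line for line
    if f1.readlines() == f2.readlines():
        print('Files {} & {} are identical!'.format(file1, file2))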

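PS: This only checks whether the files are identical. If you want something like a logical comparison, you will have to do some research first.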
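For the kind of "logical" comparison mentioned in the PS, one possible starting point is the standard library's difflib module. The sketch below is only an illustration, not a full script-to-subtitle aligner: the function name best_matches and the 0.6 threshold are assumptions made for this example. For each line of the first list it looks for the most similar line in the second list using SequenceMatcher.ratio(), and reports no match when the best score falls below the threshold.

```python
import difflib


def best_matches(lines_a, lines_b, threshold=0.6):
    """For each line in lines_a, find the most similar line in lines_b.

    Returns a list of (line_a, best_line_b_or_None, score) tuples.
    The threshold value is an arbitrary choice for illustration.
    """
    results = []
    for a in lines_a:
        best_line, best_score = None, 0.0
        for b in lines_b:
            # ratio() is a similarity score between 0.0 and 1.0
            score = difflib.SequenceMatcher(None, a, b).ratio()
            if score > best_score:
                best_line, best_score = b, score
        # Treat weak similarities as "no counterpart found"
        if best_score < threshold:
            best_line = None
        results.append((a, best_line, best_score))
    return results


# Small in-memory example using lines from the question
script = ["My name is xxx", "My mobile number is 12345"]
subtitles = ["My name is xxx", "My mother's name is"]
for a, b, score in best_matches(script, subtitles):
    print("{:.2f}  {!r} -> {!r}".format(score, a, b))
```

This compares every line with every other line, so it gets slow on long files, and it ignores line order. A real alignment would probably need to take the order of the lines into account as well.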

One possible approach is to store the lines of each file in a list and then compare the lists.

# Read all lines of each file into a list
with open("file1.txt", "r") as file:
    lines_of_file1 = file.readlines()
with open("file2.txt", "r") as file:
    lines_of_file2 = file.readlines()

# The files are the same only if they have the same number of lines
# and every pair of lines at the same position is equal
same = len(lines_of_file1) == len(lines_of_file2)
if same:
    for line1, line2 in zip(lines_of_file1, lines_of_file2):
        if line1 != line2:
            same = False
            break
if same:
    print("Files are same")
else:
    print("Files are not same")

Hope it helps.
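If you also want to know which lines differ, rather than only whether the files are the same, the list approach can be extended as in the sketch below. The helper name diff_lines is made up for this example. It pairs the lines by position with itertools.zip_longest, so a file that ends early shows up as None instead of being confused with a blank line.

```python
from itertools import zip_longest


def diff_lines(lines_a, lines_b):
    """Return (line_no, line_a, line_b) for each position where the lines differ.

    A line that is missing because one list is shorter is reported as None.
    """
    diffs = []
    pairs = zip_longest(lines_a, lines_b)  # pads the shorter list with None
    for line_no, (a, b) in enumerate(pairs, start=1):
        # Ignore trailing whitespace and newlines, but keep None as "missing"
        if a is not None:
            a = a.rstrip()
        if b is not None:
            b = b.rstrip()
        if a != b:
            diffs.append((line_no, a, b))
    return diffs


# Small in-memory example; with real files, pass the result of readlines()
first = ["My name is xxx\n", "My mother's name is yyy\n"]
second = ["My name is xxx\n", "My mother's name is\n", "extra line\n"]
for line_no, a, b in diff_lines(first, second):
    print("Line-%d" % line_no, repr(a), "vs", repr(b))
```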

