Comparing two text files line by line with Python

Question

I have two text files that I want to compare. The first file contains unique items, and the second file contains the same items repeated numerous times. I want to see how many times each line is repeated in the second file. This is what I wrote:

import sys

f1 = open('file1.txt')  # this has the 27 unique lines
f1data = f1.readlines()

f2 = open('file2.txt')  # this has lines repeated various times, with a total of 11162 lines
f2data = f2.readlines()

sys.stdout = open("linecount.txt", "w")

for line1 in f1data:
    linecount = 0
    for line2 in f2data:
        if line1 in line2:
            linecount += 1

    print(line1.strip(), linecount)

The problem is that when I add up the line counts, the total is 11586 instead of 11162. What is the reason for this increase in the line count?

Is there another way of getting a line frequency output using Python?

Answer 1

From https://docs.python.org/2.7/reference/expressions.html#in:

For the Unicode and string types, x in y is true if and only if x is a substring of y.
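To illustrate how the substring rule can inflate the totals, here is a minimal sketch with made-up lines (the crime names are hypothetical and not taken from the asker's files). Because "ASSAULT\n" is a substring of "AGGRAVATED ASSAULT\n", the `in` test counts those lines twice, while `==` counts each line exactly once.

```python
# Hypothetical data, shaped like the lists readlines() would return.
unique_lines = ["ASSAULT\n", "AGGRAVATED ASSAULT\n"]
repeated_lines = ["ASSAULT\n", "AGGRAVATED ASSAULT\n", "AGGRAVATED ASSAULT\n"]


def count_matches(needle, lines, exact):
    """Count the lines matching needle, by equality or by substring test."""
    if exact:
        return sum(1 for line in lines if needle == line)
    return sum(1 for line in lines if needle in line)


substring_total = sum(count_matches(u, repeated_lines, exact=False) for u in unique_lines)
exact_total = sum(count_matches(u, repeated_lines, exact=True) for u in unique_lines)

print(substring_total)  # 5: "ASSAULT\n" also matches both "AGGRAVATED ASSAULT\n" lines
print(exact_total)      # 3: the same as len(repeated_lines)
```

Only a unique line that is a suffix of a longer line is double-counted here, since each line read with readlines() ends in a newline.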

Instead of

    if line1 in line2:

I think you meant to write

    if line1 == line2:

Or maybe replace the whole

for line2 in f2data:
    if line1 in line2:
        linecount+=1

block with

linecount = f2data.count(line1)
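As for another way to get a line frequency output, one option that the answer above does not mention is `collections.Counter` from the standard library, which tallies every line of the second file in a single pass. The following is a minimal sketch: the function name `line_frequencies`, the sample data, and the choice to strip trailing newlines are assumptions of this example, not part of the original code.

```python
from collections import Counter


def line_frequencies(unique_lines, repeated_lines):
    """Return (line, count) pairs for each unique line, using exact matching."""
    # Strip trailing newlines so a final line without "\n" still matches.
    counts = Counter(line.rstrip("\n") for line in repeated_lines)
    # A Counter returns 0 for keys it has never seen, so missing lines are safe.
    return [(line.rstrip("\n"), counts[line.rstrip("\n")]) for line in unique_lines]


# Hypothetical stand-ins for the contents of file1.txt and file2.txt.
file1_lines = ["ASSAULT\n", "THEFT\n"]
file2_lines = ["THEFT\n", "ASSAULT\n", "THEFT\n", "THEFT"]

for line, count in line_frequencies(file1_lines, file2_lines):
    print(line, count)
```

With real files, the two lists could be replaced by open file objects, since iterating over a file yields its lines one at a time.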

Answer 2

It still does not work even if we change the code a bit. I got better results from this code:

# Read both files once; calling f2.read() inside the loop would return
# an empty string on every pass after the first.
with open('hmd4.csv') as f1:
    words1 = f1.read().split(".")
with open('svm_words.txt') as f2:
    words2 = f2.read().split("\n")

linecount = 0

for word1 in words1:
    for word2 in words2:
        if word1 in word2:
            linecount += 1
            print(linecount)


Answer 1
2 · accepted · 2015-11-08 14:28:11

Answer 2
1 · 2016-03-17 05:52:01
