Compare lines in two .txt files and print a new line for words that are not contained

Question

I have the following piece of code that, for every line in textfile1, searches textfile2 and, if the line is contained in textfile2, prints out the corresponding line of textfile2. However, I also want to print a new line for every line that is not contained in textfile2. Here is the code:

def readline():
    with open("textfile1.txt") as file, open("textfile2.txt") as file2:
        string = set(map(str.rstrip,file))
        for line in file2:
            spl = line.split(None, 1)[0]
            if spl in string:
                print(line.rstrip())
            else:              ##if spl not in string print new line
                print("\n")

It doesn't work as I expect (it doesn't print any new lines). What might be the problem, and are there any alternative solutions?

Sample Textfile1:

'
a
aa
ab
abandon
abandonaudiofocus
abandonsession
abort
abortablehttprequest
abortanimation
abortcaptures
abortconnection
abortpolicy
abortrequest
abs

Sample Textfile2:

'                |            22624
a                |               91
aa               |                7
ab               |                6
abort            |                8
abortanimation   |                5
abs              |              131
abslistview      |              115
absolutelayout   |               50
absolutesizespan |                6
abstracthttpentity |                2
abstractlist     |                1
abstractmap      |                4
abstractselector |                1
abstractset      |                2

Textfile1 includes many more words, and it contains all the words in textfile2.

Answer 1

For every line in textfile2, this looks up the first part of the line in textfile1. If that word is found there, the corresponding line of textfile2 is printed; otherwise an empty line is printed.

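def readline():
    with open("textfile1.txt") as file1:
        file1_words = set(line.rstrip() for line in file1)  # a set gives fast lookups
    with open("textfile2.txt") as file2:
        file2_list = [line.rstrip() for line in file2]
    # Keep a line if its first part is a word from textfile1, else use ''.
    fileo_list = [line if line.split(None, 1)[0] in file1_words else '' for line in file2_list]
    for line in fileo_list:
        print(line)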

This will print out:

'                |            22624
a                |               91
aa               |                7
ab               |                6
abort            |                8
abortanimation   |                5
abs              |              131


.....

Answer 2

According to your question -

for every line in textfile1, searches textfile2 and if the line is contained in textfile2 prints out the corresponding line of textfile2

And comment -

Textfile1 includes many more words and it contains all the words in textfile2

The logic you have right now is actually the opposite: it checks, for each line in file2 (textfile2.txt), whether that line's first part exists in file (textfile1.txt), which would always be true according to your comment.
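
A small sketch of why the else branch in the question never runs, using a few entries from the sample files held in memory instead of real files. The names file1_words, file2_lines and misses are illustrative only and do not appear in the original code.

```python
# A few entries from the sample files, held in memory for illustration.
file1_words = {"a", "aa", "ab", "abandon", "abort", "abs"}
file2_lines = [
    "a                |               91",
    "aa               |                7",
    "abort            |                8",
]

# The question's loop iterates over textfile2 and looks each word up in
# textfile1. Collect the lines whose lookup fails.
misses = [line for line in file2_lines
          if line.split(None, 1)[0] not in file1_words]

print(len(misses))  # 0: every word in textfile2 also appears in textfile1
```

Since no lookup fails, the branch that prints the empty line is never reached.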

You need to put all the elements (the first part of each line) of file2 into the set and then check each line of file. Example -

def get_first(line):
    return line.split(None, 1)[0]

def readline():
    with open("textfile1.txt",'r') as file, open("textfile2.txt",'r') as file2:
        # Map the first part of each textfile2 line to the whole line.
        file2_dict = {}
        for line in file2:
            file2_dict[get_first(line)] = line.rstrip()
        for line in file:
            word = line.strip()
            if word in file2_dict:
                print(file2_dict[word])
            else:              ##if word not in textfile2 print new line
                print()

Also, you do not need "\n" inside your print() in the else part. print already adds a newline by itself, so print("\n") outputs two newlines; you can just call print() to print a single one.
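
A quick way to check this yourself is to capture what each call writes. The helper captured below is a hypothetical name introduced only for this sketch; it relies on io.StringIO and contextlib.redirect_stdout from the standard library.

```python
import io
from contextlib import redirect_stdout


def captured(func):
    """Run func() and return everything it wrote to standard output."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func()
    return buffer.getvalue()


one_newline = captured(lambda: print())
two_newlines = captured(lambda: print("\n"))

print(repr(one_newline))   # '\n'
print(repr(two_newlines))  # '\n\n'
```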

Example/Demo -

>>> def get_first(line):
...     return line.split(None, 1)[0]
...
>>> def readline():
...     with open("a.txt",'r') as file, open("b.txt",'r') as file2:
...         string = set(map(get_first,file2))
...         for line in file:
...             if line.strip() in string:
...                 print(line.rstrip())
...             else:              ##if spl not in string print new line
...                 print()
...
>>> readline()
a
aa
ab



abort

abortanimation




abs

In the above example, a.txt contains the data from your example textfile1.txt and b.txt contains the data from your example textfile2.txt.
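
If you want to try the idea without creating files, here is a minimal sketch that works on lists of lines. The function name align_lines and the shortened sample data are assumptions made for illustration; they are not part of the original answer.

```python
def align_lines(words, table_lines):
    """For each word, return the matching table line, or '' if there is none.

    words       -- iterable of lines holding one word each (like textfile1)
    table_lines -- iterable of 'word | count' lines (like textfile2)
    """
    by_word = {}
    for line in table_lines:
        fields = line.split(None, 1)
        if fields:  # skip blank lines
            by_word[fields[0]] = line.rstrip()
    return [by_word.get(word.strip(), "") for word in words]


words = ["a\n", "aa\n", "abandon\n", "abort\n"]
table = ["a   |   91\n", "aa  |    7\n", "abort |  8\n"]
for row in align_lines(words, table):
    print(row)
```

Here "abandon" has no row in the table, so the third printed line is empty. Passing open file objects instead of lists should work the same way, since both yield lines when iterated.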

Answer 3

Sets make this pretty easy.

with open("textfile1.txt") as file1:
    textfile_1_set = set(map(str.rstrip, file1))

with open("textfile2.txt") as file2:
    # keep only the first part (the word) of each non-empty line
    textfile_2_set = {l.split()[0] for l in file2 if l.strip()}

# remove all the words that are in textfile2 from the
# set of lines from textfile1
in_1_but_not_2 = textfile_1_set - textfile_2_set

for line in in_1_but_not_2:
    print(line)
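
A set difference does not remember the order in which the words appeared in textfile1. If that order matters, one possible variation is to keep textfile1 as a list and use a set only for the lookup. The function name missing_words and the sample data below are hypothetical, chosen for this sketch.

```python
def missing_words(words, table_lines):
    """Return the words, in their original order, that have no row in table_lines."""
    present = {line.split()[0] for line in table_lines if line.strip()}
    return [w.strip() for w in words if w.strip() and w.strip() not in present]


words = ["a\n", "aa\n", "abandon\n", "abort\n", "abortpolicy\n"]
table = ["a   |   91\n", "aa  |    7\n", "abort |  8\n"]
print(missing_words(words, table))  # ['abandon', 'abortpolicy']
```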


Answer 1
1 2015-08-12 12:15:39

Answer 2
0 Accepted 2015-08-12 10:48:28

Answer 3
0 2015-08-13 23:40:50
