Create lists from text files and compare the lists

Question

full.txt contains:

www.example.com/a.jpg
www.example.com/b.jpg
www.example.com/k.jpg
www.example.com/n.jpg
www.example.com/x.jpg

partial.txt contains:

a.jpg
k.jpg

Why does the following code not produce the desired result?

with open ('full.txt', 'r') as infile:
    lines_full=[line for line in infile]

with open ('partial.txt', 'r') as infile:
    lines_partial=[line for line in infile]

with open ('remaining.txt', 'w') as outfile:
    for element in lines_full:
        if element[16:21] not in lines_partial: # element[16:21] is the file name, e.g. a.jpg
            outfile.write (element)

The desired remaining.txt should contain the lines of full.txt whose file names are not listed in partial.txt, exactly as follows:

www.example.com/b.jpg
www.example.com/n.jpg
www.example.com/x.jpg

Answer 1

This code keeps the newline character at the end of each line, so lines_partial holds strings such as "a.jpg\n", which never match "a.jpg" or "k.jpg" exactly:

with open ('partial.txt', 'r') as infile:
    lines_partial=[line for line in infile]

Change it to

with open ('partial.txt', 'r') as infile:
    lines_partial=[line[:-1] for line in infile]

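to get rid of the newline characters (line[:-1] means "the line without its last character"). Note that this assumes every line ends with a newline; if the last line of the file does not, line[:-1] removes a real character, and line.rstrip('\n') is the safer choice.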
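The mismatch and the fix can be checked without touching the disk. The following is only a sketch: an in-memory io.StringIO stands in for partial.txt, and it deliberately assumes the last line has no trailing newline, to show where line[:-1] and line.rstrip('\n') differ.

```python
import io

# In-memory stand-in for partial.txt (assumption: the last line has no
# trailing newline, which is common for hand-edited files).
partial_file = io.StringIO("a.jpg\nk.jpg")

# Iterating over a file keeps the line terminator on each line that has one.
raw_lines = [line for line in partial_file]

partial_file.seek(0)
# Slicing always drops the last character, even when it is not a newline.
sliced = [line[:-1] for line in partial_file]

partial_file.seek(0)
# rstrip('\n') drops trailing newlines only.
stripped = [line.rstrip('\n') for line in partial_file]

element = "www.example.com/a.jpg\n"
name = element[16:21]          # the file-name part of the line

print(name in raw_lines)       # the original comparison
print(name in stripped)        # the comparison after stripping
print(sliced)                  # shows the damaged last entry
```

Here raw_lines still contains "a.jpg\n", so the first membership test fails, while the stripped list matches as intended.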

Answer 2

You can use the os.path module, which extracts the file name without relying on fixed character positions:

from os import path

with open ('full.txt', 'r') as f:
    lines_full = f.read().splitlines()

with open ('partial.txt', 'r') as f:
    lines_partial = set(f.read().splitlines())  # create set for faster checking

lines_new = [x + '\n' for x in lines_full if path.split(x)[1] not in lines_partial]

with open('remaining.txt', 'w') as f:
    f.writelines(lines_new)
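The same idea can be wrapped in a small function so it can be tried on in-memory data. This is a sketch, not part of the original answer: the name remaining_lines is hypothetical, and it uses os.path.basename, which returns the same value as path.split(x)[1].

```python
import os.path

def remaining_lines(full_lines, partial_names):
    """Return the entries of full_lines whose file name is not in partial_names.

    Both arguments are iterables of strings; surrounding whitespace and
    newlines are stripped, and blank lines are ignored.
    """
    # A set gives constant-time membership checks.
    skip = {name.strip() for name in partial_names if name.strip()}
    result = []
    for line in full_lines:
        entry = line.strip()
        if entry and os.path.basename(entry) not in skip:
            result.append(entry)
    return result

# Sample data copied from the question, with newlines as read from a file.
full = [
    "www.example.com/a.jpg\n",
    "www.example.com/b.jpg\n",
    "www.example.com/k.jpg\n",
    "www.example.com/n.jpg\n",
    "www.example.com/x.jpg\n",
]
partial = ["a.jpg\n", "k.jpg\n"]

result = remaining_lines(full, partial)
print("\n".join(result))
```

Because the function accepts any iterable of strings, open file objects can be passed in directly in place of the sample lists.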

2 answers

Answer 1
1 2013-09-13 06:00:56

Answer 2
1 Accepted 2013-09-13 06:09:27
