
Make a list from a text file and compare the lists

full.txt contains:

www.example.com/a.jpg
www.example.com/b.jpg
www.example.com/k.jpg
www.example.com/n.jpg
www.example.com/x.jpg

partial.txt contains:

a.jpg
k.jpg

Why does the following code not produce the desired result?

with open ('full.txt', 'r') as infile:
    lines_full=[line for line in infile]

with open ('partial.txt', 'r') as infile:
    lines_partial=[line for line in infile]    

with open ('remaining.txt', 'w') as outfile:
    for element in lines_full:
        if element[16:21] not in lines_partial: #element[16:21] means like a.jpg
            outfile.write (element)  

The desired remaining.txt should contain exactly those elements of full.txt that are not in partial.txt, as follows:

www.example.com/b.jpg
www.example.com/n.jpg
www.example.com/x.jpg

The code below keeps the newline character at the end of each line, so lines_partial holds strings such as "a.jpg\n" that never exactly match "a.jpg" or "k.jpg".

with open ('partial.txt', 'r') as infile:
    lines_partial=[line for line in infile]

Change it to

with open ('partial.txt', 'r') as infile:
    lines_partial=[line[:-1] for line in infile]

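to get rid of the newline characters (line[:-1] means "the line without its last character"). Note that if the last line of the file does not end with a newline, line[:-1] removes a real character instead; line.rstrip('\n') avoids that.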
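To make the mismatch and the fix concrete, here is a small sketch that does not touch the disk. The lists are stand-ins for what iterating over full.txt and partial.txt would yield, and the missing newline on the last partial line is an assumption added to show the edge case. rstrip("\n") is swapped in for line[:-1] as a safer variant, not as the approach the answer itself uses.

```python
# Stand-ins for the lines read from full.txt and partial.txt.
# Assumption: the last line of partial.txt has no trailing newline.
lines_full = [
    "www.example.com/a.jpg\n",
    "www.example.com/b.jpg\n",
    "www.example.com/k.jpg\n",
    "www.example.com/n.jpg\n",
    "www.example.com/x.jpg\n",
]
raw_partial = ["a.jpg\n", "k.jpg"]

# Unstripped: "a.jpg" is compared against "a.jpg\n", so it is not found.
found_unstripped = "a.jpg" in raw_partial

# line[:-1] drops the last character even when it is not a newline.
sliced = [line[:-1] for line in raw_partial]

# rstrip("\n") only removes trailing newline characters.
stripped = [line.rstrip("\n") for line in raw_partial]

# Same filter as in the question, now against the cleaned list.
remaining = [element for element in lines_full
             if element[16:21] not in stripped]

print(found_unstripped)
print(sliced)
print(stripped)
print("".join(remaining), end="")
```

With the stripped list, the filter keeps only the b.jpg, n.jpg and x.jpg lines, which is the desired content of remaining.txt.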

You can use the os.path module:

from os import path

with open ('full.txt', 'r') as f:
    lines_full = f.read().splitlines()

with open ('partial.txt', 'r') as f:
    lines_partial = set(f.read().splitlines())  # create set for faster checking

lines_new = [x + '\n' for x in lines_full if path.split(x)[1] not in lines_partial]

with open('remaining.txt', 'w') as f:
    f.writelines(lines_new)
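One reason to prefer path.split(x)[1] over a fixed slice such as element[16:21] is that it returns whatever follows the last slash, whatever the length of the host or the file name. The sketch below illustrates this. The helper names file_name and filter_remaining and the longer URL are hypothetical, made up for the example.

```python
from os import path


def file_name(url):
    """Return the part of url after the last '/'."""
    return path.split(url)[1]


def filter_remaining(full, partial):
    """Keep entries of full whose file name is not listed in partial."""
    partial_set = set(partial)  # set gives fast membership checks
    return [x for x in full if file_name(x) not in partial_set]


# Hypothetical longer URL: the fixed slice no longer isolates the file name.
long_url = "www.example.com/images/photo_01.jpeg"
print(file_name(long_url))
print(long_url[16:21])

full = ["www.example.com/a.jpg", "www.example.com/b.jpg", long_url]
print(filter_remaining(full, ["a.jpg", "photo_01.jpeg"]))
```

Here the lists are assumed to be already free of newlines, as they are after f.read().splitlines() in the answer above.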
