
Remove certain links from a textfile by reading textfile

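I have whitelist.txt, which contains some links, and scrapedlist.txt, which contains other links as well as the links from whitelist.txt.

I am trying to open and read whitelist.txt, then open and read scrapedlist.txt, and write a new file, updatedlist2.txt, that contains everything in scrapedlist.txt minus the contents of whitelist.txt.

I am still quite new to Python, so I am still learning. I have searched for answers, and this is what I came up with: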

def whitelist_file_func():
    with open("whitelist.txt", "r") as whitelist_read:
        whitelist_read.readlines()
    whitelist_read.close()

    unique2 = set()

    with open("scrapedlist.txt", "r") as scrapedlist_read:
        scrapedlist_lines = scrapedlist_read.readlines()
    scrapedlist_read.close()

    unique3 = set()

    with open("updatedlist2.txt", "w") as whitelist_write2:
   
        for line in scrapedlist_lines:
            if unique2 not in line and line not in unique3:
                whitelist_write2.write(line)
                unique3.add(line)

I get this error, and I am also not sure whether I am going about this the right way:

if unique2 not in line and line not in unique3:
TypeError: 'in <string>' requires string as left operand, not set

What should I do to achieve the above, and is my code correct?

Edit:

whitelist.txt:

KUWAIT
ISRAEL
FRANCE

scrapedlist.txt:

USA
CANADA
GERMANY
KUWAIT
ISRAEL
FRANCE

updatedlist2.txt (it should look like this):

USA
CANADA
GERMANY

Based on your description, I made some changes to your code.

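  1. The readlines() method is replaced with read().splitlines(). Both read the whole file and turn each line into a list item. The difference is that readlines() keeps the \n at the end of each item.
  2. unique2 and unique3 were removed. I could not find a use for them.
  3. After the first two changes, whitelist_lines and scrapedlist_lines are two lists containing the links. Based on your description, we need the lines of scrapedlist_lines that are not in whitelist_lines, so the condition if unique2 not in line and line not in unique3: becomes if line not in whitelist_lines:
  4. whitelist_write2.close() is called after writing. Strictly speaking this is redundant, because the with statement already closes the file when its block ends, but it is harmless.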
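To illustrate points 1 and 3, here is a small sketch that uses io.StringIO as a stand-in for a real file (the sample text is an assumption, chosen to match the whitelist in the question, with no newline after the last entry). It shows how the two reading styles differ and why the original condition raised a TypeError.

```python
import io

# Simulated file contents; the last line has no trailing newline,
# which is common in hand-edited text files.
sample = "KUWAIT\nISRAEL\nFRANCE"

with_newlines = io.StringIO(sample).readlines()
without_newlines = io.StringIO(sample).read().splitlines()

print(with_newlines)     # ['KUWAIT\n', 'ISRAEL\n', 'FRANCE']
print(without_newlines)  # ['KUWAIT', 'ISRAEL', 'FRANCE']

# A line that still carries its "\n" does not match an entry without one,
# so comparing readlines() output can silently miss the last line.
print("FRANCE\n" in with_newlines)   # False
print("FRANCE" in without_newlines)  # True

# The original condition had a set on the left of `in` and a str on the
# right; for a str, the left operand must also be a str.
try:
    set() not in "FRANCE"
except TypeError as exc:
    print(exc)
```

Stripping the line endings on both sides, as splitlines() does, makes the comparison independent of whether the files end with a newline.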

The final code is:

# Read the whitelist as a list of lines without trailing newlines
with open("whitelist.txt", "r") as whitelist_read:
    whitelist_lines = whitelist_read.read().splitlines()
    whitelist_read.close()

# Read the scraped list the same way
with open("scrapedlist.txt", "r") as scrapedlist_read:
    scrapedlist_lines = scrapedlist_read.read().splitlines()
    scrapedlist_read.close()

# Write every scraped line that is not whitelisted, one per line
with open("updatedlist2.txt", "w") as whitelist_write2:
    for line in scrapedlist_lines:
        if line not in whitelist_lines:
            whitelist_write2.write(line + "\n")
    whitelist_write2.close()
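As a possible variation, not part of the answer above, the same filtering can be sketched with a set for the whitelist, which keeps lookups fast when the lists grow. The function name filter_links and its parameters are hypothetical. Skipping repeated lines is an assumption based on the unique3 set in the question, which looks like it was meant to avoid writing duplicates.

```python
from pathlib import Path


def filter_links(scraped_path, whitelist_path, output_path):
    """Write the lines of scraped_path that are not in whitelist_path.

    Hypothetical helper for illustration. Returns the list of lines
    written to output_path, in their original order.
    """
    # A set gives constant-time membership checks on average.
    whitelist = set(Path(whitelist_path).read_text().splitlines())

    seen = set()  # lines already kept, so duplicates are written once
    kept = []
    for line in Path(scraped_path).read_text().splitlines():
        if line not in whitelist and line not in seen:
            seen.add(line)
            kept.append(line)

    # Terminate every kept line with a newline.
    Path(output_path).write_text("".join(line + "\n" for line in kept))
    return kept
```

With the files from the question, calling filter_links("scrapedlist.txt", "whitelist.txt", "updatedlist2.txt") should leave USA, CANADA and GERMANY in updatedlist2.txt. No explicit close() is needed here because read_text() and write_text() open and close the file themselves.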

