Removing certain links from a text file by reading another text file

Question

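I have whitelist.txt, which contains some links, and scrapedlist.txt, which contains other links plus the links that are also in whitelist.txt.

I am trying to open and read whitelist.txt, then open and read scrapedlist.txt, and write a new file, updatedlist2.txt, that contains everything in scrapedlist.txt minus the contents of whitelist.txt.

I am still fairly new to Python, so I am still learning. I have searched for answers, and this is what I came up with: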

def whitelist_file_func():
    with open("whitelist.txt", "r") as whitelist_read:
        whitelist_read.readlines()
    whitelist_read.close()

    unique2 = set()

    with open("scrapedlist.txt", "r") as scrapedlist_read:
        scrapedlist_lines = scrapedlist_read.readlines()
    scrapedlist_read.close()

    unique3 = set()

    with open("updatedlist2.txt", "w") as whitelist_write2:
   
        for line in scrapedlist_lines:
            if unique2 not in line and line not in unique3:
                whitelist_write2.write(line)
                unique3.add(line)

I get this error, and I am also not sure whether I am going about this the right way:

if unique2 not in line and line not in unique3:
TypeError: 'in <string>' requires string as left operand, not set

What should I do to achieve the above, and is my code correct?

Edit:

whitelist.txt:

KUWAIT
ISRAEL
FRANCE

scrapedlist.txt:

USA
CANADA
GERMANY
KUWAIT
ISRAEL
FRANCE

updatedlist2.txt (it should look like this):

USA
CANADA
GERMANY

Answer 1

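Based on your description, I made a few changes to your code.

readlines() was replaced with read().splitlines(). Both read the whole file and turn each line into a list item. The difference is that readlines() keeps the \n at the end of each item.
unique2 and unique3 were removed. unique2 was never filled, and unique3 only skipped repeated lines, which your description does not require.
After the first two blocks, whitelist_lines and scrapedlist_lines are two lists containing the links. According to your description, we want the lines of scrapedlist_lines that are not in whitelist_lines, so the condition if unique2 not in line and line not in unique3: becomes if line not in whitelist_lines: .
An explicit close() after writing is not required: the with statement closes each file when its block ends.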
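To see the readlines() versus read().splitlines() difference and the cause of the TypeError in isolation, here is a small sketch. It uses io.StringIO as an in-memory stand-in for whitelist.txt (an assumption made only so the snippet runs without any files on disk); the variable names are illustrative.

```python
import io

# In-memory stand-in for whitelist.txt, using the sample data from the question.
sample = "KUWAIT\nISRAEL\nFRANCE\n"

with_newlines = io.StringIO(sample).readlines()
without_newlines = io.StringIO(sample).read().splitlines()

print(with_newlines)     # ['KUWAIT\n', 'ISRAEL\n', 'FRANCE\n']
print(without_newlines)  # ['KUWAIT', 'ISRAEL', 'FRANCE']

# `x in some_string` is a substring test, so x must itself be a string.
# Putting a set on the left side, as the original condition did, raises TypeError.
try:
    set() in "KUWAIT\n"
    raised = False
except TypeError:
    raised = True

print(raised)  # True
```

Because readlines() keeps the trailing \n, comparing its items against newline-free strings would never match, which is why both files should be read the same way.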

The final code is:

# Each with block closes its file on exit, so no explicit close() calls are needed.

# Read the whitelist; splitlines() drops the trailing newline of each line.
with open("whitelist.txt", "r") as whitelist_read:
    whitelist_lines = whitelist_read.read().splitlines()

# Read the scraped list the same way.
with open("scrapedlist.txt", "r") as scrapedlist_read:
    scrapedlist_lines = scrapedlist_read.read().splitlines()

# Write only the scraped lines that are not on the whitelist.
with open("updatedlist2.txt", "w") as whitelist_write2:
    for line in scrapedlist_lines:
        if line not in whitelist_lines:
            whitelist_write2.write(line + "\n")
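If the lists grow large, one possible variation is to hold the whitelist in a set, so each membership test does not scan the whole list. The sketch below is not part of the original answer: the function name remove_whitelisted and its parameters are hypothetical, and the demo writes the question's sample data into a temporary directory so it can run anywhere.

```python
import os
import tempfile


def remove_whitelisted(scraped_path, whitelist_path, output_path):
    """Write every line of scraped_path that is not in whitelist_path."""
    with open(whitelist_path, "r") as whitelist_file:
        # A set gives constant-time membership tests on average.
        whitelist = set(whitelist_file.read().splitlines())

    kept = []
    with open(scraped_path, "r") as scraped_file, open(output_path, "w") as output_file:
        for line in scraped_file:
            entry = line.rstrip("\n")
            if entry not in whitelist:
                output_file.write(entry + "\n")
                kept.append(entry)
    return kept


# Demo with the sample data from the question, in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    whitelist_path = os.path.join(tmp, "whitelist.txt")
    scraped_path = os.path.join(tmp, "scrapedlist.txt")
    output_path = os.path.join(tmp, "updatedlist2.txt")

    with open(whitelist_path, "w") as f:
        f.write("KUWAIT\nISRAEL\nFRANCE\n")
    with open(scraped_path, "w") as f:
        f.write("USA\nCANADA\nGERMANY\nKUWAIT\nISRAEL\nFRANCE\n")

    kept = remove_whitelisted(scraped_path, whitelist_path, output_path)
    with open(output_path) as f:
        written = f.read()

print(kept)  # ['USA', 'CANADA', 'GERMANY']
```

The output order follows scrapedlist.txt, and repeated lines in the scraped file are kept as they are, matching the behavior of the list-based code.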

Answer 1: accepted, 2021-12-25 21:01:28
