
Extracting text from another file using Python

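I have a long list of text in one column of a CSV file (proposal.csv), under the header "proposal". The text consists of sentences that contain addresses (e.g. building names and postcodes). I have another CSV file (building.csv) with building names under the column "building".

I want to extract all the building names from the sentences in the proposal column.

Is there a way to do this? I have spent almost a whole day trying to figure this out but can't get it to work. I tried the df.isin(keywords) method, but it returns False for everything.

An example row from the proposal column: "i live in taj mahal and it is a very pretty place". I want to extract the term "taj mahal" because it is a building (taj mahal is listed in my building CSV).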

See the screenshot for the error:

[Screenshot: error]

Try this:

import csv

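with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row described in the question
    sentences = [row[0] for row in reader if row]

with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row described in the question
    buildings = [row[0] for row in reader if row]

# Keep every building name that occurs as a substring of at least one sentence
buildings_in_sentences = []
for building in buildings:
    if any(building in sentence for sentence in sentences):
        buildings_in_sentences.append(building)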

You can also use a list comprehension:

buildings_in_sentences = [building for building in buildings if any(
    building in sentence for sentence in sentences)]
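
As for why df.isin(keywords) came back all False in the question: isin checks whether each whole cell equals one of the keywords, while the code above checks whether a keyword occurs inside the sentence. A minimal sketch of that difference in plain Python, using the example sentence from the question ("eiffel tower" is a made-up extra entry for illustration):

```python
sentences = ["i live in taj mahal and it is a very pretty place"]
buildings = ["taj mahal", "eiffel tower"]  # "eiffel tower" is illustrative only

# Whole-value membership: is the entire sentence equal to a building name?
# This mirrors what isin tests cell by cell.
exact_matches = [sentence in buildings for sentence in sentences]

# Substring search: which building names occur inside the sentence?
substring_matches = [
    [b for b in buildings if b in sentence] for sentence in sentences
]

print(exact_matches)      # [False]
print(substring_matches)  # [['taj mahal']]
```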

To write a CSV file that pairs each sentence with the building keyword found in it, do the following:

import csv

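with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row described in the question
    sentences = [row[0] for row in reader if row]

with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row described in the question
    buildings = [row[0] for row in reader if row]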

output_rows = [
    [
        sentence, 
        next((b for b in buildings if b in sentence), "not_found")
    ] for sentence in sentences]

with open("proposal.out.csv", "w", newline="") as file:  # newline="" avoids blank rows on Windows
    writer = csv.writer(file)
    writer.writerows(output_rows)
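
Note that the plain `in` test is case-sensitive and also matches inside longer words, and next(...) keeps only the first building found in each sentence. If that matters for your data, something along these lines may help. This is only a sketch: it assumes whole-word, case-insensitive matching is what you want, and the helper name find_buildings and the sample data are hypothetical.

```python
import re

def find_buildings(sentence, buildings):
    """Return every building name that appears in sentence as whole words."""
    found = []
    for building in buildings:
        # re.escape guards names that contain regex metacharacters;
        # the lookarounds stop "hall" from matching inside "shallow".
        pattern = r"(?<!\w)" + re.escape(building) + r"(?!\w)"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(building)
    return found

buildings = ["taj mahal", "hall"]  # sample data for illustration
sentence = "I live in Taj Mahal near a shallow river"
print(find_buildings(sentence, buildings))  # ['taj mahal']
```

To use it in the CSV output above, the second column could hold "; ".join(find_buildings(sentence, buildings)) or "not_found" instead of the next(...) expression.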

