
Extracting text from another file using Python

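I have a long list of texts in one column of a CSV file (proposal.csv), under the header "proposal". The texts are sentences that contain addresses (e.g. building names and postcodes). I have another CSV file (building.csv) with building names under the column "Building".

I want to extract all the building names from the sentences in the proposal column.

Is there a way to do this? I have spent almost a whole day trying to figure this out, but I can't get it to work. I tried the df.isin(keywords) method, but it returns False for everything.

An example row from the proposal column: "i live in taj mahal and it is a very pretty place". I want to extract the term "taj mahal" because it is a building (taj mahal is listed in my building CSV).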

See the screenshot for the error:

[Screenshot: error]

Try this:

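import csv

# Read the first column of each file; next(reader, None) skips the header row.
with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "proposal" header
    sentences = [row[0] for row in reader if row]

with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "Building" header
    buildings = [row[0] for row in reader if row]

# Keep every building name that occurs as a substring of at least one sentence.
buildings_in_sentences = []
for building in buildings:
    if any(building in sentence for sentence in sentences):
        buildings_in_sentences.append(building)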

You can also use a list comprehension:

buildings_in_sentences = [building for building in buildings if any(
    building in sentence for sentence in sentences)]
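As for why df.isin(keywords) returned False everywhere in the question: isin checks whether each whole cell value equals one of the keywords, and a full sentence never equals a building name. The code above uses a substring test instead. A minimal sketch of the difference in plain Python, using the example sentence from the question ("eiffel tower" is a made-up second entry, added only for illustration):

```python
# The sentence comes from the question; "eiffel tower" is a hypothetical
# extra building name used only to show a non-match.
sentences = ["i live in taj mahal and it is a very pretty place"]
buildings = ["taj mahal", "eiffel tower"]

# Exact-match membership (roughly what isin does for each cell):
# the whole sentence is compared against each building name.
exact_matches = [sentence in buildings for sentence in sentences]

# Substring test: which building names are contained in the sentence?
substring_matches = [
    [building for building in buildings if building in sentence]
    for sentence in sentences
]

print(exact_matches)      # [False]
print(substring_matches)  # [['taj mahal']]
```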

To write a CSV file containing each sentence and the building keyword found in it, do the following:

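import csv

# Read the first column of each file; next(reader, None) skips the header row.
with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "proposal" header
    sentences = [row[0] for row in reader if row]

with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "Building" header
    buildings = [row[0] for row in reader if row]

# Pair each sentence with the first building name it contains,
# or "not_found" when no building name matches.
output_rows = [
    [
        sentence,
        next((b for b in buildings if b in sentence), "not_found")
    ] for sentence in sentences]

# newline="" stops the csv module from writing blank lines between rows on Windows.
with open("proposal.out.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(output_rows)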
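Note that the plain in test is case-sensitive, also matches inside longer words, and next(...) keeps only the first building found in each sentence. If any of that matters for your data, one possible variation is a regular expression with word boundaries. This is only a sketch: the helper name find_buildings and the sample data ("big ben", the test sentences) are assumptions for illustration, not part of the original answer.

```python
import re

def find_buildings(sentence, buildings):
    """Return every building name that appears in `sentence` as whole words,
    ignoring case. `find_buildings` is a hypothetical helper name."""
    found = []
    for building in buildings:
        # re.escape protects names that contain regex metacharacters;
        # the lookarounds reject matches glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(building) + r"(?!\w)"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(building)
    return found

buildings = ["taj mahal", "big ben"]
print(find_buildings("I live in Taj Mahal near Big Ben", buildings))
# ['taj mahal', 'big ben']
print(find_buildings("the big bend hotel", buildings))
# []
```

To store all matches in the output file, the second column could hold something like "; ".join(find_buildings(sentence, buildings)) or "not_found" in place of the next(...) expression.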

