Extracting text from another file using Python
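I have a long list of text in one column of a CSV file (proposal.csv), under the header "proposal". It consists of sentences that contain addresses (for example, building names and postcodes). I have another CSV file (building.csv) that lists building names under the column "building".
I want to extract all the building names from the sentences in the proposal column.
Is there a way to do this? I have spent almost a whole day trying to figure it out but cannot get my head around it. I tried the df.isin(keywords) method, but it returns False for everything.
Example of one row in the proposal column: "i live in taj mahal and it is a very pretty place". I want to extract the term "taj mahal" because it is a building (taj mahal is listed in my building CSV).
See the screenshot for the error: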
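Try this:
import csv
# Read the first column of each file, skipping the header row.
with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "proposal" header
    sentences = [row[0] for row in reader if row]
with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "building" header
    buildings = [row[0] for row in reader if row]
# Keep each building name that appears as a substring of any sentence.
buildings_in_sentences = []
for building in buildings:
    if any(building in sentence for sentence in sentences):
        buildings_in_sentences.append(building)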
You can also use a list comprehension:
buildings_in_sentences = [
    building for building in buildings
    if any(building in sentence for sentence in sentences)
]
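As a side note on why df.isin(keywords) came back all False: isin tests whether each whole cell value equals one of the keywords, whereas the loop above tests whether a keyword occurs inside the cell. The sketch below mimics both checks with plain lists, so it runs without pandas; the second building name is made up purely for illustration.

```python
# Sample data: the sentence comes from the question, "eiffel tower" is a
# hypothetical extra entry added for illustration.
sentences = ["i live in taj mahal and it is a very pretty place"]
buildings = ["taj mahal", "eiffel tower"]

# Whole-value membership: is the entire sentence equal to a building name?
# This is the kind of comparison isin performs per cell, so it is False here.
exact_match = [sentence in buildings for sentence in sentences]

# Substring membership: does any building name occur inside the sentence?
substring_match = [
    any(building in sentence for building in buildings)
    for sentence in sentences
]

print(exact_match)
print(substring_match)
```

The substring form is what the question needs, because a full sentence will almost never be identical to a building name.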
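To write a CSV file containing each sentence together with its building keyword, do the following:
import csv
# Read the first column of each file, skipping the header row.
with open("proposal.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "proposal" header
    sentences = [row[0] for row in reader if row]
with open("building.csv", "r", newline="") as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the "building" header
    buildings = [row[0] for row in reader if row]
# Pair each sentence with the first building found in it, or "not_found".
output_rows = [
    [
        sentence,
        next((b for b in buildings if b in sentence), "not_found")
    ] for sentence in sentences]
# newline="" stops the csv module from writing blank lines on Windows.
with open("proposal.out.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerows(output_rows)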
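Note that the code above records only the first matching building per sentence, and a plain `in` test is case-sensitive and also matches inside longer words. If you need every building in a sentence, a variant along these lines may help. It is only a sketch: `find_buildings` is a hypothetical helper name, the sample data (apart from the taj mahal sentence) is invented, and whether case-insensitive whole-phrase matching is right depends on your data.

```python
import re


def find_buildings(sentence, buildings):
    """Return every building name that occurs in sentence as a whole phrase.

    Hypothetical helper: matching is case-insensitive, and a name only
    counts when it is not glued to other word characters on either side.
    """
    found = []
    for name in buildings:
        # re.escape guards against names containing regex metacharacters.
        pattern = r"(?<!\w)" + re.escape(name) + r"(?!\w)"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            found.append(name)
    return found


# Illustrative data; in practice these lists come from the two CSV files.
sentences = [
    "i live in taj mahal and it is a very pretty place",
    "from the Shard you can see Tower Bridge",
    "nothing to see here",
]
buildings = ["taj mahal", "the shard", "tower bridge"]

results = {sentence: find_buildings(sentence, buildings) for sentence in sentences}
for sentence, names in results.items():
    print(sentence, "->", names)
```

Each list of names can then be joined (for example with "; ") and written as the second column with csv.writer, in the same way as the output above.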