Save NLTK tagger output to a CSV file
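I am trying to analyze text to find every token tagged 'NN' or 'NNP'. The code runs fine so far, but when I save the output to a CSV file I cannot get the format I need, which is one row per match with three columns: word, tag, analyzed question.
Here is the code: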
import csv
import nltk

training_set = []
text = 'I want to analized this text'
tokenized = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokenized)
# Keep only singular nouns (NN) and proper nouns (NNP).
result = [(word, tag) for word, tag in tagged if tag in ('NN', 'NNP')]
for i in result:
    training_set.append(i)
    training_set.append([text])
print(training_set)
# Each element of training_set becomes its own CSV row.
with open('sample.csv', 'w', newline='') as listFile2:
    writer2 = csv.writer(listFile2, quoting=csv.QUOTE_ALL, lineterminator='\n', delimiter=',')
    for item in training_set:
        writer2.writerow(item)
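The result looks like this:
Any idea how to keep all of the information on the same row? Like this:
I changed the code to use two lists and then used zip to write both to the CSV file. That seems to work, but every value ends up wrapped in "" and ().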
import csv
import nltk

training_set = []
question = []
text = 'I want to analyzed this text'
tokenized = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokenized)
# Keep only singular nouns (NN) and proper nouns (NNP).
result = [(word, tag) for word, tag in tagged if tag in ('NN', 'NNP')]
for i in result:
    training_set.append(i)
    question.append([text])
# zip pairs each (word, tag) tuple with a one-element list holding the text.
with open('sample.csv', 'w', newline='') as listFile2:
    writer2 = csv.writer(listFile2, quoting=csv.QUOTE_ALL, lineterminator='\n', delimiter=',')
    for item in zip(training_set, question):
        writer2.writerow(item)
Result:
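The output itself is not reproduced here. As a rough illustration of where the extra quotes and parentheses come from, the sketch below feeds `csv.writer` the same kind of row that `zip` builds above. The tag values are hard-coded stand-ins (an assumption, not real tagger output), and an in-memory buffer replaces the file so the row can be inspected directly. `csv.writer` converts any non-string cell with `str()`, so a tuple or list placed in a cell is written as its Python representation.

```python
import csv
import io

# Stand-ins for the two lists built above; the tag is assumed, not real tagger output.
training_set = [('text', 'NN')]
question = [['I want to analyzed this text']]

buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator='\n')
for item in zip(training_set, question):
    # item is (('text', 'NN'), ['I want to analyzed this text']):
    # two cells, each a container, so each is written via str().
    writer.writerow(item)

nested_output = buffer.getvalue()
print(nested_output)
```

The row has only two cells, one holding the text of the tuple and one holding the text of the list, which is why the brackets survive into the file.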
Before writing the data to the CSV file, you can try the following to get it into the desired format:
[tag + (text,) for tag in result]
OUTPUT:
[('text', 'NN', 'I want to analyze this text')]
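Essentially, this gives you a list of flat (word, tag, text) tuples in the desired format, which you can then write to the CSV file one tuple per row.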
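Putting it together, a minimal sketch of the whole flow might look like the following. The helper names `build_rows` and `write_rows` are hypothetical, and the `tagged` list is a hard-coded stand-in for the output of `nltk.pos_tag(nltk.word_tokenize(text))`, so the tags shown are assumed rather than produced by the tagger.

```python
import csv


def build_rows(tagged, text, wanted=('NN', 'NNP')):
    """Return one flat (word, tag, text) tuple per token whose tag is wanted."""
    return [(word, tag, text) for word, tag in tagged if tag in wanted]


def write_rows(path, rows):
    """Write each tuple as one CSV row, quoting every field."""
    with open(path, 'w', newline='') as handle:
        writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator='\n')
        writer.writerows(rows)


text = 'I want to analyze this text'
# Assumed stand-in for nltk.pos_tag(nltk.word_tokenize(text)).
tagged = [('I', 'PRP'), ('want', 'VBP'), ('to', 'TO'),
          ('analyze', 'VB'), ('this', 'DT'), ('text', 'NN')]

rows = build_rows(tagged, text)
write_rows('sample.csv', rows)
```

Because every row is a flat tuple of strings, each value lands in its own column, and the only quotes in the file are the ones added by `csv.QUOTE_ALL`.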