
Save NLTK tagger output to a CSV file

I'm trying to analyze a text to find all the words tagged 'NN' and 'NNP'. So far the code works well, but when I save the output to a CSV file I haven't been able to get the format I want, which is one row per word with the columns: Word, Tag, Question Analyzed.

This is the code:

import csv
import nltk

training_set = []

text = 'I want to analyze this text'
tokenized = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokenized)
result = [(word, tag) for word, tag in tagged if tag in ('NN', 'NNP')]

for i in result:
    training_set.append(i)       # the (word, tag) tuple
    training_set.append([text])  # the text, as a separate entry
    print(training_set)

listFile2 = open('sample.csv', 'w', newline='')
writer2 = csv.writer(listFile2, quoting=csv.QUOTE_ALL, lineterminator='\n', delimiter=',')
for item in training_set:
    writer2.writerow(item)
listFile2.close()

The outcome is the following:

[image not available]

Any idea how I can keep all the information on the same line, like this:

[image not available]

I have changed the code to use two lists and then zip to write both to the CSV file. This seems to work; however, every value ends up enclosed in "" and ().

training_set = []
question = []

text = 'I want to analyze this text'
tokenized = nltk.word_tokenize(text)
tagged = nltk.pos_tag(tokenized)
result = [(word, tag) for word, tag in tagged if tag in ('NN', 'NNP')]
for i in result:
    training_set.append(i)
    question.append([text])

listFile2 = open('sample.csv', 'w', newline='')
writer2 = csv.writer(listFile2, quoting=csv.QUOTE_ALL, lineterminator='\n', delimiter=',')
for item in zip(training_set, question):
    writer2.writerow(item)
listFile2.close()

Result:

[image not available]

You can try something like this to get your data into the desired format before writing it to the CSV file:

[pair + (text,) for pair in result]

OUTPUT:

[('text', 'NN', 'I want to analyze this text')]

It gives you a list of flat (word, tag, text) tuples, which is the format you need. You can then write each tuple to your CSV file as one row.
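
To show the whole flow end to end, here is a minimal sketch of building the flat rows and writing them out. The helper names build_rows and write_rows are made up for this example, the header row is taken from the column names in the question, and the tagged list is a hand-written stand-in for what nltk.pos_tag(nltk.word_tokenize(text)) might return (the exact tags are an assumption). It writes to an in-memory buffer so it runs without NLTK; to write to disk, pass a file opened with open('sample.csv', 'w', newline='') instead.

```python
import csv
import io


def build_rows(tagged, text, wanted=("NN", "NNP")):
    """Return one flat (word, tag, text) tuple per token whose tag is in `wanted`."""
    return [(word, tag, text) for word, tag in tagged if tag in wanted]


def write_rows(rows, handle):
    """Write a header plus one CSV line per row, quoting every field."""
    writer = csv.writer(handle, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(("Word", "Tag", "Question Analyzed"))
    writer.writerows(rows)


text = "I want to analyze this text"

# Assumed stand-in for nltk.pos_tag(nltk.word_tokenize(text));
# the real tagger output may differ.
tagged = [
    ("I", "PRP"), ("want", "VBP"), ("to", "TO"),
    ("analyze", "VB"), ("this", "DT"), ("text", "NN"),
]

rows = build_rows(tagged, text)

buffer = io.StringIO()
write_rows(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

The zip version in the question produced "" and () because each row passed to writerow was a pair of containers, ((word, tag), [text]), and csv.writer converts any non-string field with str(). Flattening each row into three plain strings avoids that.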
