
How to use the Stanford English tagger in Python on Mac OS

I can run the Stanford English tagger successfully. For example, with the input "This picture is clear", the output is:

[[(u'This', u'DT'), (u'picture', u'NN'), (u'is', u'VBZ'), (u'clear', u'JJ')]]

But I want to read a whole file and get output like this:

This_DT picture_NN is_VBZ clear_JJ

That is, as a plain sentence rather than the bracketed format. I don't know how to do this in Python.

My original code

import nltk
from nltk.tag.stanford import POSTagger
st = POSTagger('/Users/apple/Desktop/package/stanford-postagger/models/english-left3words-distsim.tagger', '/Users/apple/Desktop/package/stanford-postagger/stanford-postagger.jar')

print(st.tag('This picture is clear'.split()))

Fairly straightforward list/tuple/string manipulation:

inp = [[(u'This', u'DT'), (u'picture', u'NN'), (u'is', u'VBZ'), (u'clear', u'JJ')]]

out = []
for t in inp[0]:
    # t is a (word, tag) tuple; join its two parts with an underscore
    out.append("_".join(t))

# separate the word_TAG tokens with spaces
outs = " ".join(out)
print(outs)

The data you have is a list of lists of tuples. We are only interested in the first element, hence the inp[0] .

We iterate through this list (I could have used a list comprehension), joining the two elements of each tuple ( t ) with an underscore and collecting the results in another list ( out ). It is then a simple task to join those elements with spaces to produce the string This_DT picture_NN is_VBZ clear_JJ.
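Since the question asks about a whole file, the same idea can be extended to every sentence rather than only inp[0]. Below is a minimal sketch using a list comprehension. The function name format_tagged is hypothetical, and the second sentence in the sample data is hand-written for illustration, not real tagger output.

```python
def format_tagged(sentences):
    """Turn [[(word, tag), ...], ...] into one 'word_TAG word_TAG' string per sentence."""
    return [" ".join(word + "_" + tag for word, tag in sentence)
            for sentence in sentences]


# Sample data in the shape shown in the question (a list of lists of tuples).
# The second sentence is made up for illustration only.
tagged = [
    [(u'This', u'DT'), (u'picture', u'NN'), (u'is', u'VBZ'), (u'clear', u'JJ')],
    [(u'It', u'PRP'), (u'works', u'VBZ')],
]

lines = format_tagged(tagged)
for line in lines:
    print(line)
```

For a file, one option would be to call st.tag(line.split()) on each line, collect the results, and pass them to a function like this one. That assumes the file holds one sentence per line and that each result has the shape shown in the question.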
