
Stanford Entity Recognizer (caseless) in Python NLTK

I am trying to figure out how to use the caseless version of the Stanford entity recognizer from NLTK. I downloaded http://nlp.stanford.edu/software/stanford-ner-2015-04-20.zip and placed it in Python's site-packages folder. Then I downloaded http://nlp.stanford.edu/software/stanford-corenlp-caseless-2015-04-20-models.jar and placed it in the stanford-ner-2015-04-20 folder. Then I ran this code:

from nltk.tag.stanford import NERTagger
english_nertagger = NERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-corenlp-caseless-2015-04-20-models.jar')

But when I run this:

english_nertagger.tag('Rami Eid is studying at stony brook university in NY'.split())

I get an error:

Error: Could not find or load main class edu.stanford.nlp.ie.crf.CRFClassifier

Any help from someone with experience is appreciated!

P.S. I can get the regular (case-sensitive) version working fine, but when analysing search queries I find that users hardly ever capitalize words, and that version appears to miss entities completely when they are not capitalized.

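The second parameter of StanfordNERTagger is the path to the Stanford NER jar file, not the path to the models jar. The jar you passed holds models rather than the tagger code, which is why Java cannot find the main class edu.stanford.nlp.ie.crf.CRFClassifier. So change the second argument to the path of stanford-ner.jar (and make sure the file is actually at that path, of course).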
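One way to confirm this before calling the tagger is to look inside the jar. A jar is a zip archive, and a Java class a.b.C is stored in it as the entry a/b/C.class. The snippet below is only a minimal sketch built on Python's standard zipfile module; the helper name jar_has_class is made up for illustration and is not part of NLTK or the Stanford tools.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar at jar_path contains the compiled class class_name.

    Hypothetical helper for illustration; not part of NLTK or Stanford NER.
    """
    # A Java class a.b.C is stored inside a jar as the entry a/b/C.class.
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


# Example call (the path is a placeholder, adjust it to your own layout):
# jar_has_class("/path/to/stanford-ner.jar", "edu.stanford.nlp.ie.crf.CRFClassifier")
```

If this returns False for the jar you pass as the second argument, the "Could not find or load main class" error is the expected result.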

Also, it seems that you should use the caseless classifier english.conll.4class.caseless.distsim.crf.ser.gz (it comes from stanford-corenlp-caseless-2015-04-20-models.jar, so it has to be extracted from that jar first) instead of english.conll.4class.distsim.crf.ser.gz.
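Getting the classifier out of the models jar can be done with any unzip tool, or from Python. The sketch below is one possible approach, not an official procedure: since the directory layout inside the models jar is not given here, it searches the archive for an entry with the wanted file name instead of assuming a path. The function name extract_by_filename is hypothetical.

```python
import os
import shutil
import zipfile


def extract_by_filename(jar_path, filename, dest_dir):
    """Copy the first jar entry whose base name equals filename into dest_dir.

    Returns the path of the extracted file, or None if no entry matches.
    Hypothetical helper for illustration only.
    """
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            # Compare only the last path component, ignoring internal folders.
            if entry.rsplit("/", 1)[-1] == filename:
                if not os.path.isdir(dest_dir):
                    os.makedirs(dest_dir)
                target = os.path.join(dest_dir, filename)
                with jar.open(entry) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)
                return target
    return None


# Example call (paths are placeholders, adjust them to your own layout):
# extract_by_filename("stanford-corenlp-caseless-2015-04-20-models.jar",
#                     "english.conll.4class.caseless.distsim.crf.ser.gz",
#                     "stanford-ner-2015-04-20/classifiers")
```

A return value of None means the jar has no entry with that file name, in which case listing the archive contents by hand is the next thing to try.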

Thus try the following:

from nltk.tag.stanford import StanfordNERTagger
english_nertagger = StanfordNERTagger('/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/classifiers/english.conll.4class.caseless.distsim.crf.ser.gz', '/home/anaconda/lib/python2.7/site-packages/stanford-ner-2015-04-20/stanford-ner.jar')

Update: NERTagger has been renamed to StanfordNERTagger in later versions of NLTK.
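If the same script has to run against both older and newer NLTK installs, the rename can be absorbed with a fallback import. This is just a sketch of that idea; the function name load_ner_tagger_class is made up here, and it returns None when neither name can be imported (for example when NLTK is not installed).

```python
def load_ner_tagger_class():
    """Return the Stanford NER tagger class under whichever name NLTK provides.

    Hypothetical helper: tries the newer name first, then the older one,
    and returns None if neither import succeeds.
    """
    try:
        from nltk.tag.stanford import StanfordNERTagger
        return StanfordNERTagger
    except ImportError:
        pass
    try:
        from nltk.tag.stanford import NERTagger
        return NERTagger
    except ImportError:
        return None


# Example use: pick the class once, then construct the tagger as shown above.
# TaggerClass = load_ner_tagger_class()
```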

