Python NLTK: Stanford NER tagger error message: NLTK was unable to find the java file

I'm trying to get Stanford NER working with Python. I followed some instructions on the web but got this error message: "NLTK was unable to find the java file! Use software specific configuration paramaters or set the JAVAHOME environment variable." What is wrong? Thank you!

from nltk.tag.stanford import StanfordNERTagger
from nltk.tokenize import word_tokenize

model = r'C:\Stanford\NER\classifiers\english.muc.7class.distsim.crf.ser.gz'
jar = r'C:\Stanford\NER\stanford-ner-3.9.1.jar'

ner_tagger = StanfordNERTagger(model, jar, encoding = 'utf-8')

text = 'While in France, Christine Lagarde discussed short-term stimulus ' \
       'efforts in a recent interview with the Wall Street Journal.'

words = word_tokenize(text)
classified_words = ner_tagger.tag(words)

I found the solution on the web: tell NLTK where the Java executable is before creating the tagger. Replace the path with your own.

import os
java_path = "C:/../../jdk1.8.0_101/bin/java.exe"
os.environ['JAVAHOME'] = java_path

or:

import nltk
nltk.internals.config_java('C:/../../jdk1.8.0_101/bin/java.exe')

Source: https://tianyouhu.wordpress.com/2016/09/01/problem-of-nltk-with-stanfordtokenizer/
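If you would rather not hard-code the JDK path, here is a minimal sketch that tries an explicit path first and otherwise falls back to whatever "java" resolves to on PATH. The helper names resolve_java and set_javahome are hypothetical (they are not part of NLTK); only os.environ['JAVAHOME'] comes from the fix above, and the sketch assumes it is run before the tagger is created.

```python
import os
import shutil


def resolve_java(explicit_path=None):
    """Return a path to a java executable, or None if none can be found.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # 1. A path supplied by the caller wins, provided the file exists.
    if explicit_path and os.path.isfile(explicit_path):
        return explicit_path
    # 2. Otherwise fall back to whatever "java" resolves to on PATH.
    return shutil.which("java")


def set_javahome(explicit_path=None):
    """Set the JAVAHOME variable that NLTK checks; raise if java is missing."""
    java = resolve_java(explicit_path)
    if java is None:
        raise FileNotFoundError(
            "No java executable found; pass its full path explicitly."
        )
    os.environ["JAVAHOME"] = java
    return java
```

Call set_javahome() (optionally with the full path to your own java.exe) before constructing StanfordNERTagger. The returned path can also be passed to nltk.internals.config_java instead of relying on the environment variable.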
