Python NLTK: Stanford NER tagger error message: NLTK was unable to find the java file

Question

I am trying to get Stanford NER working with Python. I followed some instructions on the web, but running the code below gives this error message: "NLTK was unable to find the java file! Use software specific configuration paramaters or set the JAVAHOME environment variable." What is wrong? Thank you!

from nltk.tag.stanford import StanfordNERTagger
from nltk.tokenize import word_tokenize

model = r'C:\Stanford\NER\classifiers\english.muc.7class.distsim.crf.ser.gz'
jar = r'C:\Stanford\NER\stanford-ner-3.9.1.jar'

ner_tagger = StanfordNERTagger(model, jar, encoding='utf-8')

text = 'While in France, Christine Lagarde discussed short-term stimulus ' \
       'efforts in a recent interview with the Wall Street Journal.'

words = word_tokenize(text)
classified_words = ner_tagger.tag(words)

Answer 1

I found the solution on the web: NLTK cannot locate the Java executable, so tell it where java.exe is before creating the tagger. Replace the path with your own.

import os
java_path = "C:/../../jdk1.8.0_101/bin/java.exe"
os.environ['JAVAHOME'] = java_path

or:

import nltk
nltk.internals.config_java('C:/../../jdk1.8.0_101/bin/java.exe')

Source: https://tianyouhu.wordpress.com/2016/09/01/problem-of-nltk-with-stanfordtokenizer/
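If you would rather not hard-code the JDK path, something like the following might work as a rough sketch. It looks for a Java executable in JAVAHOME, then JAVA_HOME, then on PATH, and sets JAVAHOME to whatever it finds. The helper name find_java is made up for this example and is not part of NLTK, and the lookup order is an assumption about what is convenient rather than a description of NLTK's own search.

```python
import os
import shutil


def find_java(env=None):
    """Return a path to a Java executable, or None if none is found.

    find_java is a hypothetical helper for this example, not an NLTK function.
    It checks JAVAHOME, then JAVA_HOME, then the directories on PATH.
    """
    env = os.environ if env is None else env
    exe = "java.exe" if os.name == "nt" else "java"
    for var in ("JAVAHOME", "JAVA_HOME"):
        value = env.get(var)
        if not value:
            continue
        # The variable may name the executable itself or a JDK/JRE directory.
        candidates = [
            value,
            os.path.join(value, exe),
            os.path.join(value, "bin", exe),
        ]
        for candidate in candidates:
            if os.path.isfile(candidate):
                return candidate
    # Fall back to searching PATH.
    return shutil.which("java", path=env.get("PATH"))


if __name__ == "__main__":
    java_path = find_java()
    if java_path is None:
        print("No Java executable found; install a JDK or set JAVAHOME by hand.")
    else:
        # Same effect as the first snippet in this answer, without a fixed path.
        os.environ["JAVAHOME"] = java_path
        print("JAVAHOME set to", java_path)
```

Run this before constructing StanfordNERTagger. If it reports that no Java executable was found, Java is probably not installed, and setting the path by hand will not help until it is.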
