
NLTK fails to find the Java executable

I am using NLTK's nltk.tag.stanford, which needs to call the java executable.

I set JAVAHOME to C:\Program Files\Java\jdk1.6.0_25, where my JDK is installed, but when I run the program I get the error:

"NLTK was unable to find the java executable! Use the config_java() or set the JAVAHOME variable"

Then I spent 3 hours debugging it and tried:

config_java("C:/Program Files/Java/jdk1.6.0_25/")

config_java("C:/Program Files/Java/jdk1.6.0_25/bin/")
and the same two paths without the trailing "/".

However, NLTK still cannot find it.

Does anyone have an idea what is going wrong? Thanks a lot!

If setting the JAVA_HOME environment variable doesn't help, try this:

config_java() did not work for me. I added the following lines to my code and it worked:

import os
java_path = "C:/Program Files/Java/jdk1.7.0_11/bin/java.exe"
os.environ['JAVAHOME'] = java_path
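
A minimal sketch of the same idea that checks the path before exporting it, so a typo in the JDK version fails early instead of surfacing later as the NLTK error. The helper name set_javahome is hypothetical, the JDK path is the one from this answer and should be replaced with your own, and the sketch assumes Python 3:

```python
import os


def set_javahome(java_path, environ=os.environ):
    """Store java_path in JAVAHOME, but only if that file really exists.

    set_javahome is a hypothetical helper, not part of NLTK.
    """
    if not os.path.isfile(java_path):
        raise FileNotFoundError("No Java executable at: " + java_path)
    environ["JAVAHOME"] = java_path
    return java_path


if __name__ == "__main__":
    try:
        # Replace with the path to java.exe on your machine.
        set_javahome("C:/Program Files/Java/jdk1.7.0_11/bin/java.exe")
    except FileNotFoundError as err:
        print(err)
```

Call it before creating the tagger, just like the three lines above.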

I am running Windows 7 64-bit.

I spent about seven hours working through this problem and finally found a solution. You can write the path to your Java executable directly into lines 69 and 72 of the internals.py file (build 2.0.4) as follows:

##########################################################################
# Java Via Command-Line
##########################################################################

_java_bin = r'C:\Program Files\Java\jdk1.7.0_25\bin\java.exe'
_java_options = []
# [xx] add classpath option to config_java?
def config_java(bin=r'C:\Program Files\Java\jdk1.7.0_25\bin\java.exe', options=None, verbose=True):

This resolved the problem for me. (I'm working in a 32-bit Windows environment.)

protos1210's tip worked for me, with a few minor changes. The full answer is:

import nltk
nltk.internals.config_java("C:/Program Files/Java/jdk1.6.0_30/bin/java.exe")

After I restarted IDLE, the following code worked.

import nltk
path_to_model = "C:/Program Files/stanford-postagger-2012-05-22/models/english-bidirectional-distsim.tagger"
path_to_jar = "C:/Program Files/stanford-postagger-2012-05-22/stanford-postagger.jar"
tagger = nltk.tag.stanford.POSTagger(path_to_model, path_to_jar)
tokens = nltk.tokenize.word_tokenize("I hope this works!")
print(tagger.tag(tokens))

Output is: [('I', 'PRP'), ('hope', 'VBP'), ('this', 'DT'), ('works', 'VBZ'), ('!', '.')].

I never could get it to recognize my JAVAHOME environment variable.

I took a look here, and the documentation seems to indicate that the argument should be something like this:

config_java("C:/Program Files/Java/jdk1.6.0_25/bin/java")

Depending on your environment, you might want to try reinstalling the NLTK binary. I installed from the binary and later upgraded via easy_install, which incorrectly installed the OS X version of NLTK; that caused exceptions when NLTK couldn't find my java binary.

Another possible cause of this error message when using the stanford package in NLTK is using StanfordTagger instead of POSTagger or NERTagger. According to Google Groups, this was a design decision to steer users away from the general StanfordTagger class and toward one of the two specific taggers.

Another distinct cause in this situation is that you are using an IDE such as Eclipse. Even if you have set your JAVA_HOME environment variable, and even if you explicitly call config_java and get the [Found ... /bin/java.exe] message back, you may still have to set the runtime environment for your IDE. The reason is that when you invoke the tagger, config_java is called again as part of the process, so your original attempts at setting the path to the java executable can be overwritten.
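
To check what the interpreter launched by the IDE actually sees, a small diagnostic along these lines may help. It is only a sketch: java_environment_report is a hypothetical helper name, it uses nothing from NLTK, and it assumes Python 3 (for shutil.which):

```python
import os
import shutil


def java_environment_report(environ=None):
    """Collect what this Python process can see about Java.

    java_environment_report is a hypothetical helper, not part of NLTK.
    """
    environ = os.environ if environ is None else environ
    return {
        "JAVAHOME": environ.get("JAVAHOME"),
        "JAVA_HOME": environ.get("JAVA_HOME"),
        # Search only the PATH from the given environment; None if not found.
        "java on PATH": shutil.which("java", path=environ.get("PATH", "")),
    }


for key, value in java_environment_report().items():
    print(key, "=", value)
```

Running it once from a plain terminal and once from inside the IDE shows whether the two environments differ.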

I realize that this is an old question, but here is the solution that worked for me (running on Windows 7 64-bit). Hopefully it will save someone some time.

I implemented the solution given here:

 "I have been able to get it working by commenting out two lines in the batch_tag function in     
 \nltk\tag\stanford.py

  The lines are line 59 and 85.

 config_java(options=self.java_options, verbose=False)
 and 
 config_java(options=default_options, verbose=False)
 respectively."

After commenting out the lines, I set the path to the Java executable in the same manner mentioned in other answers:

 nltk.internals.config_java("path/to/javadk/bin/java.exe")

A kludgey but workable solution. Everything worked fine after that.

Hopefully this saves someone else some time when trying to fix this problem. I'm pretty new to programming, Python and the NLTK, and didn't realize when I was trying to implement @dduhaime's solution that there are two 'internals.py' files: one in the nltk folder (path=C:\nltk-2.0.4 on my computer) and one in my Python27 folder (path=C:\Python27\Lib\site-packages\nltk-2.0.4-py2.7.egg\nltk on my computer). You have to add the path to the Java executable on lines 69 and 72 of the latter 'internals.py' file, or the NLTK will still not be able to find it.
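
One way to see which copy of a module Python will actually load, instead of guessing between the two folders, is to ask the import system. This is a sketch that assumes Python 3; locate_module is a hypothetical helper name. On the Python 2.7 setup described here, importing nltk.internals and printing its __file__ attribute gives the same information:

```python
import importlib.util


def locate_module(name):
    """Return the file Python would load for module `name`, or None.

    locate_module is a hypothetical helper, not part of NLTK.
    Note: for a dotted name, find_spec imports the parent package first.
    """
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # The module or its parent package is not installed.
        return None
    return spec.origin if spec is not None else None


# Prints the path of the internals.py that is really in use,
# or None if NLTK is not installed for this interpreter.
print(locate_module("nltk.internals"))
```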

My environment: Windows 7 64-bit, NLTK build 2.0.4

I tried all the solutions mentioned above, as well as the ones on Google Groups, but none worked. After a few more rounds of trial and error and modifications to the above answers, the following piece of code worked for me:

>>> import os

>>> os.environ['JAVAHOME'] = "C:/Program Files/Java/jdk1.8.0_31/bin"  # insert the appropriate JDK version

And then I tried the NERTagger code:

>>> from nltk.tag.stanford import NERTagger

>>> st = NERTagger('stanford-ner-2014-06-16/classifiers/english.all.3class.distsim.crf.ser.gz','stanford-ner-2014-06-16/stanford-ner.jar')

>>> st.tag('John has refused the offer from Facebook. He will work for Google'.split())

And the following was the output I received:

[(u'John', u'PERSON'), (u'has', u'O'), (u'refused', u'O'), (u'the', u'O'), (u'offer', u'O'), (u'from', u'O'), (u'Facebook', u'ORGANIZATION'), (u'.', u'O')]

Tested on Windows 7 64-bit.

I too have been running into problems with this. It has been such a headache!

I got this to work on my machine (Win7_x64).

Replace 'jdk1.6.0_30' with your version of the JDK. Run this command:

config_java("C:/Program Files/Java/jdk1.6.0_30/bin/java.exe")
[Found C:/Program Files/Java/jdk1.6.0_30/bin/java.exe: C:/Program Files/Java/jdk1.6.0_30/bin/java.exe]

I do not know why it has been this difficult to get working. Hope this helps!

I implemented a workaround for this because NLTK misunderstands the meaning of the JAVA_HOME variable:

import os
if os.environ.get("JAVA_HOME") is not None and "/bin" not in os.environ["JAVA_HOME"]:
    os.environ["JAVAHOME"] = os.path.normpath(os.path.join(os.environ["JAVA_HOME"], "bin"))

This basically takes the correct value you have in JAVA_HOME, creates the NLTK-friendly version, and stores it in JAVAHOME. NLTK checks both, so it will find the binary. Obviously, you need to do this before the tagger is created.
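
The "/bin" substring test in the snippet above only matches forward slashes, so a Windows value that already ends in \bin would get a second bin appended. A slightly more portable variant compares the last path component instead. This is a sketch under the same assumption as the answer, namely that NLTK accepts the bin directory in JAVAHOME; derive_javahome is a hypothetical helper name:

```python
import os


def derive_javahome(java_home):
    """Return the bin directory for a JAVA_HOME-style value.

    derive_javahome is a hypothetical helper, not part of NLTK.
    """
    cleaned = os.path.normpath(java_home)  # also drops a trailing separator
    if os.path.basename(cleaned).lower() == "bin":
        return cleaned  # already points at the bin directory
    return os.path.join(cleaned, "bin")


java_home = os.environ.get("JAVA_HOME")
if java_home:
    # Must run before the tagger is created.
    os.environ["JAVAHOME"] = derive_javahome(java_home)
```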

I came across the same issue, and what worked for me was really simple. When you set the JavaHome variable, point it to the JDK folder on your machine as shown below:

C:\Program Files\Java\jdk\ - This worked

C:\Program Files\Java\jdk - This did not work

This answer is for Ubuntu 14.04.

Comment out two lines in the batch_tag function in \nltk\tag\stanford.py.

The lines are 59 and 85:

config_java(options=self.java_options, verbose=False) and config_java(options=default_options, verbose=False), respectively.

After commenting out the lines, I set the path to the Java executable in the same manner mentioned in other answers: nltk.internals.config_java("path/to/javadk/bin/java")

Everything worked fine after that.
