How to set up a Stanford CoreNLP server on Windows to return sentiment for text

Question

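I am trying to set up a local server on Windows using Stanford CoreNLP to compute sentiment scores for more than 1M article and video texts. I don't know Java, so I need some help.

I successfully installed Stanford CoreNLP 3.6.0 and started the server with: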

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer

Sending this HTTP POST from another computer works, and I receive the expected response (xxx.xxx.xxx.xxx is the server's IP address):

wget --post-data 'the quick brown fox jumped over the lazy dog' 'xxx.xxx.xxx.xxx:9000/?properties={"tokenize.whitespace": "true", "annotators": "tokenize,ssplit,pos,lemma,parse", "outputFormat": "json"}' -O -

However, the response does not include sentiment. The obvious solution is to add the annotator:

wget --post-data 'the quick brown fox jumped over the lazy dog' 'xxx.xxx.xxx.xxx:9000/?properties={"tokenize.whitespace": "true", "annotators": "tokenize,ssplit,pos,lemma,parse,sentiment", "outputFormat": "json"}' -O -

But on the server side, I get this error:

java.lang.IllegalArgumentException: Unknown annotator: sentiment
at edu.stanford.nlp.pipeline.StanfordCoreNLP.ensurePrerequisiteAnnotators(StanfordCoreNLP.java:281)
at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.getProperties(StanfordCoreNLPServer.java:476)
at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:350)
at com.sun.net.httpserver.Filter$Chain.doFilter(Unknown Source)
at sun.net.httpserver.AuthFilter.doFilter(Unknown Source)
at com.sun.net.httpserver.Filter$Chain.doFilter(Unknown Source)
at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(Unknown Source)
at com.sun.net.httpserver.Filter$Chain.doFilter(Unknown Source)
at sun.net.httpserver.ServerImpl$Exchange.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

The next obvious solution was to pass the annotators as an argument when starting the server:

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -annotators "tokenize,ssplit,pos,lemma,parse,sentiment"

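Running the same two HTTP POSTs as before gives exactly the same result and the same error, respectively.

Am I doing something wrong, or does the core code need to be modified? I don't know Java, so I can't make those changes myself.

As a side note, this similar command starts a console that appears to load sentiment correctly: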

java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators "tokenize,ssplit,pos,lemma,parse,sentiment"

[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[main] INFO edu.stanford.nlp.pipeline.TokenizerAnnotator - TokenizerAnnotator: No tokenizer type provided. Defaulting to PTBTokenizer.
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.5 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [0.4 sec].
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator sentiment

Entering interactive shell. Type q RETURN or EOF to quit.
NLP> _

Answer 1

Try running with the GitHub version of the code. Your first solution is correct; the fact that it can't find the sentiment annotator is a bug in the code:

wget --post-data 'the quick brown fox jumped over the lazy dog' 'xxx.xxx.xxx.xxx:9000/?properties={"annotators": "tokenize,ssplit,pos,lemma,parse,sentiment", "outputFormat": "json"}' -O -
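Since the question mentions not knowing Java, here is a rough Python sketch of the same request, which may be easier to loop over many texts than wget. The helper names `build_url` and `annotate` are made up for illustration, and `localhost` stands in for the server address. It assumes the server accepts the `properties` JSON percent-encoded in the query string, which you should verify against your own server.

```python
import json
import urllib.parse
import urllib.request


def build_url(host, port, annotators):
    """Return the server URL with the properties JSON percent-encoded.

    build_url is a hypothetical helper, not part of CoreNLP.
    """
    props = {"annotators": ",".join(annotators), "outputFormat": "json"}
    query = urllib.parse.urlencode({"properties": json.dumps(props)})
    return f"http://{host}:{port}/?{query}"


def annotate(text, url, timeout=60):
    """POST raw text to the server and return the decoded JSON reply.

    Assumes a CoreNLP server is already listening at `url`.
    """
    request = urllib.request.Request(url, data=text.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; call annotate(text, url) once the server is up.
    url = build_url(
        "localhost", 9000,
        ["tokenize", "ssplit", "pos", "lemma", "parse", "sentiment"],
    )
    print(url)
```

Building the query with `urllib.parse.urlencode` avoids the shell-quoting problems of putting raw JSON in a URL by hand.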

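(Side note: the tokenize.whitespace property is in the documentation to show that you can pass arbitrary properties, but I would advise against using it in production.)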
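Once the sentiment annotator loads, the scores still have to be pulled out of the JSON reply. Below is a minimal Python sketch of that step. It assumes each entry under `sentences` carries a `sentiment` label and a `sentimentValue` string, so check those key names against a real reply from your server. The `sample` dictionary is hand-written for illustration and is not actual server output, and `sentence_sentiments` is a hypothetical helper.

```python
import json


def sentence_sentiments(reply):
    """Return (sentence index, label, numeric value) for each sentence.

    The keys "sentiment" and "sentimentValue" are assumptions about the
    reply format; missing keys yield None instead of raising.
    """
    results = []
    for sentence in reply.get("sentences", []):
        label = sentence.get("sentiment")
        value = sentence.get("sentimentValue")
        numeric = int(value) if value is not None else None
        results.append((sentence.get("index"), label, numeric))
    return results


# Hand-written stand-in for a server reply (illustrative only).
sample = json.loads(
    '{"sentences": [{"index": 0, "sentiment": "Neutral", "sentimentValue": "2"}]}'
)
print(sentence_sentiments(sample))
```

Sentiment is reported per sentence, so for a whole article you would still need to decide how to combine the per-sentence values (for example an average); that choice is up to you.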

Answer 1: score 5, accepted, 2016-02-10 23:54:02