Stanford CoreNLP gives NullPointerException
I am trying to get my head around the Stanford CoreNLP API. I want to tokenize a simple sentence using the following code:
Properties props = new Properties();
props.put("annotators", "tokenize");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
// read some text in the text variable
String text = "I wish this code would run.";
// create an empty Annotation just with the given text
Annotation document = new Annotation(text);
// run all Annotators on this text
pipeline.annotate(document);
// these are all the sentences in this document
// a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
for(CoreMap sentence: sentences) {
    // traversing the words in the current sentence
    // a CoreLabel is a CoreMap with additional token-specific methods
    for (CoreLabel token: sentence.get(TokensAnnotation.class)) {
        // this is the text of the token
        String word = token.get(TextAnnotation.class);
        // this is the POS tag of the token
        String pos = token.get(PartOfSpeechAnnotation.class);
        // this is the NER label of the token
        String ne = token.get(NamedEntityTagAnnotation.class);
    }
    // this is the parse tree of the current sentence
    Tree tree = sentence.get(TreeAnnotation.class);
    // this is the Stanford dependency graph of the current sentence
    SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
}
// This is the coreference link graph
// Each chain stores a set of mentions that link to each other,
// along with a method for getting the most representative mention
// Both sentence and token offsets start at 1!
Map<Integer, CorefChain> graph = document.get(CorefChainAnnotation.class);
This is taken from the Stanford NLP website itself, so I expected it to work out of the box. Sadly it does not: it throws a NullPointerException at this line:
for(CoreMap sentence: sentences) {...
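The code you took from the Stanford NLP website performs all of the annotations on the text variable. To run only specific annotators, you have to change the code accordingly.
To perform tokenization only, this is sufficient: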
Properties props = new Properties();
props.put("annotators", "tokenize");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(text);
pipeline.annotate(document);
for (CoreLabel token: document.get(TokensAnnotation.class)) {
    String word = token.get(TextAnnotation.class);
}
If the annotators do not include the sentence splitter ("ssplit"), this line of code returns null:
document.get(SentencesAnnotation.class);
Iterating over that null list in the for loop is what throws the NullPointerException.
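To make the mechanism concrete, here is a minimal sketch that uses only the Java standard library. It is not the CoreNLP API: `ClassKeyedMapSketch`, `TokensKey`, `SentencesKey`, and `annotate` are hypothetical stand-ins that imitate a map keyed by class objects, where a key that no annotator filled in simply comes back as null. The whitespace "tokenizer" and one-sentence "splitter" are placeholders as well.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a class-keyed annotation map; NOT the CoreNLP API.
class ClassKeyedMapSketch {
    // Marker classes used only as lookup keys (names are assumptions).
    static final class TokensKey {}
    static final class SentencesKey {}

    private final Map<Class<?>, List<String>> values = new HashMap<>();

    void set(Class<?> key, List<String> value) {
        values.put(key, value);
    }

    // Returns null when nothing was stored under the given key.
    List<String> get(Class<?> key) {
        return values.get(key);
    }

    // Simulates a pipeline that fills in only the keys of the requested annotators.
    static ClassKeyedMapSketch annotate(String text, String annotators) {
        ClassKeyedMapSketch doc = new ClassKeyedMapSketch();
        List<String> names = Arrays.asList(annotators.trim().split("\\s*,\\s*"));
        if (names.contains("tokenize")) {
            // Placeholder tokenizer: split on whitespace.
            doc.set(TokensKey.class, Arrays.asList(text.split("\\s+")));
        }
        if (names.contains("ssplit")) {
            // Placeholder splitter: treat the whole text as one sentence.
            doc.set(SentencesKey.class, Collections.singletonList(text));
        }
        return doc;
    }

    public static void main(String[] args) {
        ClassKeyedMapSketch doc = annotate("I wish this code would run.", "tokenize");
        List<String> sentences = doc.get(SentencesKey.class);
        // Guard against the missing annotation instead of iterating over null.
        if (sentences == null) {
            System.out.println("No sentences found: add ssplit to the annotators.");
        } else {
            for (String sentence : sentences) {
                System.out.println(sentence);
            }
        }
    }
}
```

The same kind of null check on the result of `document.get(SentencesAnnotation.class)` in the real code would turn the exception into a readable message, although the actual fix is still to request the missing annotator.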
This line retrieves the sentence annotations:
List<CoreMap> sentences = document.get(SentencesAnnotation.class);
But your pipeline contains only the tokenizer, not the sentence splitter.
Change the annotators line to the following:
props.put("annotators", "tokenize, ssplit"); // add sentence splitter