Stanford CoreNLP gives NullPointerException

I'm trying to get my head around the Stanford CoreNLP API. I want to tokenize a simple sentence using the following code:

    Properties props = new Properties();
    props.put("annotators", "tokenize");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // read some text in the text variable
    String text = "I wish this code would run.";

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);

    // run all Annotators on this text
    pipeline.annotate(document);

    // these are all the sentences in this document
    // a CoreMap is essentially a Map that uses class objects as keys and has values with custom types
    List<CoreMap> sentences = document.get(SentencesAnnotation.class);

    for(CoreMap sentence: sentences) {
        // traversing the words in the current sentence
        // a CoreLabel is a CoreMap with additional token-specific methods
        for (CoreLabel token: sentence.get(TokensAnnotation.class)) {
            // this is the text of the token
            String word = token.get(TextAnnotation.class);
            // this is the POS tag of the token
            String pos = token.get(PartOfSpeechAnnotation.class);
            // this is the NER label of the token
            String ne = token.get(NamedEntityTagAnnotation.class);       
        }

        // this is the parse tree of the current sentence
        Tree tree = sentence.get(TreeAnnotation.class);

        // this is the Stanford dependency graph of the current sentence
        SemanticGraph dependencies = sentence.get(CollapsedCCProcessedDependenciesAnnotation.class);
    }

    // This is the coreference link graph
    // Each chain stores a set of mentions that link to each other,
    // along with a method for getting the most representative mention
    // Both sentence and token offsets start at 1!
    Map<Integer, CorefChain> graph = document.get(CorefChainAnnotation.class);

This is taken straight from the Stanford NLP website, so I hoped it would work out of the box. Sadly it doesn't: it throws a NullPointerException at:

    for(CoreMap sentence: sentences) {...

The code you picked up from the Stanford NLP website is written for a pipeline that performs all the annotations on the text variable. To perform only specific annotations, you have to change the code accordingly.

To perform tokenization only, this would be sufficient:

    Properties props = new Properties();
    props.put("annotators", "tokenize");  // tokenizer only, no sentence splitter
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    String text = "I wish this code would run.";
    Annotation document = new Annotation(text);
    pipeline.annotate(document);

    // without ssplit, read the tokens from the document itself
    for (CoreLabel token: document.get(TokensAnnotation.class)) {
        String word = token.get(TextAnnotation.class);
    }

This line of code returns null if the annotators list doesn't include the sentence splitter ("ssplit"):

    document.get(SentencesAnnotation.class);

The for loop then tries to iterate over that null value, which is why you were getting the NullPointerException.
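
The failure can be reproduced without CoreNLP on the classpath. The following is a minimal sketch, assuming a plain `Map` as a hypothetical stand-in for the `Annotation` object; the class `MissingAnnotationDemo`, its methods, and the string keys are made up for illustration and are not part of the CoreNLP API. It shows that a for-each loop over the `null` returned for an absent key throws the NullPointerException, and how an explicit check could turn that into a clearer error:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MissingAnnotationDemo {

    // Mirrors the loop from the question: iterates over whatever get() returned.
    // Throws NullPointerException when the "sentences" key was never set.
    public static int countSentencesUnsafe(Map<String, List<String>> document) {
        int count = 0;
        for (String sentence : document.get("sentences")) {
            count++;
        }
        return count;
    }

    // Same lookup, but a missing value is reported with a descriptive message.
    public static int countSentencesChecked(Map<String, List<String>> document) {
        List<String> sentences = document.get("sentences");
        if (sentences == null) {
            throw new IllegalStateException(
                "no sentence annotation found; was the sentence splitter run?");
        }
        return sentences.size();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a document that was only tokenized:
        // tokens are present, sentences are absent.
        Map<String, List<String>> document = new HashMap<>();
        document.put("tokens",
            Arrays.asList("I", "wish", "this", "code", "would", "run", "."));

        try {
            countSentencesUnsafe(document);
        } catch (NullPointerException e) {
            System.out.println("unsafe loop: NullPointerException");
        }

        try {
            countSentencesChecked(document);
        } catch (IllegalStateException e) {
            System.out.println("checked lookup: " + e.getMessage());
        }
    }
}
```

With CoreNLP itself, the analogous check would be a null test on the list returned by `document.get(SentencesAnnotation.class)` before entering the loop.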

This line retrieves the sentence annotations:

    List<CoreMap> sentences = document.get(SentencesAnnotation.class);

But your pipeline contains only the tokenizer, not the sentence splitter, so there are no sentence annotations to retrieve. Change the annotators line as follows:

    props.put("annotators", "tokenize, ssplit"); // add sentence splitter
