简体   繁体   English

Stanford CoreNLP服务器的JSON响应缺少RelationExtractor批注

[英]Stanford CoreNLP server's JSON response missing RelationExtractor annotations

I'm processing a simple sentence to test Stanford's RelationExtractor : 我正在处理一个简单的句子来测试Stanford的RelationExtractor

Microsoft is based in New York. 微软总部位于纽约。

(it's not) (不是)

When I'm annotating the sentence in Java, by directly using the CoreNLP jar files I get the wanted result - CoreNLP finds a OrgBased_In relation between Microsoft and New York . 当我用Java注释句子时,通过直接使用CoreNLP jar文件,我得到了想要的结果-CoreNLPMicrosoftNew York之间找到了OrgBased_In关系。

for (CoreMap sentence : sentences) {
    relationType = sentence.get(MachineReadingAnnotations.RelationMentionsAnnotation.class).get(0).type // => OrgBased_In
}

However, sending the same sentence into the CoreNLP Server like so: 但是,将相同的句子发送到CoreNLP Server中,如下所示:

curl --data 'Microsoft is based in New York.' 'http://localhost:9000/?properties={%22annotators%22%3A%22tokenize%2Cssplit%2Cpos%2Clemma%2Cner%2Cparse%2Cdepparse%2Crelation%22%2C%22outputFormat%22%3A%22json%22}' -o -

Results in a json response that contains no data on relations whatsoever: 导致json响应,其中不包含任何关系数据:

{'sentences': [{'basicDependencies': [{'dep': 'ROOT',
                                   'dependent': 3,
                                   'dependentGloss': 'based',
                                   'governor': 0,
                                   'governorGloss': 'ROOT'},
                                  {'dep': 'nsubjpass',
                                   'dependent': 1,
                                   'dependentGloss': 'Microsoft',
                                   'governor': 3,
                                   'governorGloss': 'based'},
                                  {'dep': 'auxpass',
                                   'dependent': 2,
                                   'dependentGloss': 'is',
                                   'governor': 3,
                                   'governorGloss': 'based'},
                                  {'dep': 'case',
                                   'dependent': 4,
                                   'dependentGloss': 'in',
                                   'governor': 6,
                                   'governorGloss': 'York'},
                                  {'dep': 'compound',
                                   'dependent': 5,
                                   'dependentGloss': 'New',
                                   'governor': 6,
                                   'governorGloss': 'York'},
                                  {'dep': 'nmod',
                                   'dependent': 6,
                                   'dependentGloss': 'York',
                                   'governor': 3,
                                   'governorGloss': 'based'},
                                  {'dep': 'punct',
                                   'dependent': 7,
                                   'dependentGloss': '.',
                                   'governor': 3,
                                   'governorGloss': 'based'}],
            'enhancedDependencies': [{'dep': 'ROOT',
                                      'dependent': 3,
                                      'dependentGloss': 'based',
                                      'governor': 0,
                                      'governorGloss': 'ROOT'},
                                     {'dep': 'nsubjpass',
                                      'dependent': 1,
                                      'dependentGloss': 'Microsoft',
                                      'governor': 3,
                                      'governorGloss': 'based'},
                                     {'dep': 'auxpass',
                                      'dependent': 2,
                                      'dependentGloss': 'is',
                                      'governor': 3,
                                      'governorGloss': 'based'},
                                     {'dep': 'case',
                                      'dependent': 4,
                                      'dependentGloss': 'in',
                                      'governor': 6,
                                      'governorGloss': 'York'},
                                     {'dep': 'compound',
                                      'dependent': 5,
                                      'dependentGloss': 'New',
                                      'governor': 6,
                                      'governorGloss': 'York'},
                                     {'dep': 'nmod:in',
                                      'dependent': 6,
                                      'dependentGloss': 'York',
                                      'governor': 3,
                                      'governorGloss': 'based'},
                                     {'dep': 'punct',
                                      'dependent': 7,
                                      'dependentGloss': '.',
                                      'governor': 3,
                                      'governorGloss': 'based'}],
            'enhancedPlusPlusDependencies': [{'dep': 'ROOT',
                                              'dependent': 3,
                                              'dependentGloss': 'based',
                                              'governor': 0,
                                              'governorGloss': 'ROOT'},
                                             {'dep': 'nsubjpass',
                                              'dependent': 1,
                                              'dependentGloss': 'Microsoft',
                                              'governor': 3,
                                              'governorGloss': 'based'},
                                             {'dep': 'auxpass',
                                              'dependent': 2,
                                              'dependentGloss': 'is',
                                              'governor': 3,
                                              'governorGloss': 'based'},
                                             {'dep': 'case',
                                              'dependent': 4,
                                              'dependentGloss': 'in',
                                              'governor': 6,
                                              'governorGloss': 'York'},
                                             {'dep': 'compound',
                                              'dependent': 5,
                                              'dependentGloss': 'New',
                                              'governor': 6,
                                              'governorGloss': 'York'},
                                             {'dep': 'nmod:in',
                                              'dependent': 6,
                                              'dependentGloss': 'York',
                                              'governor': 3,
                                              'governorGloss': 'based'},
                                             {'dep': 'punct',
                                              'dependent': 7,
                                              'dependentGloss': '.',
                                              'governor': 3,
                                              'governorGloss': 'based'}],
            'index': 0,
            'parse': '(ROOT\n'
                     '  (S\n'
                     '    (NP (NNP Microsoft))\n'
                     '    (VP (VBZ is)\n'
                     '      (VP (VBN based)\n'
                     '        (PP (IN in)\n'
                     '          (NP (NNP New) (NNP York)))))\n'
                     '    (. .)))',
            'tokens': [{'after': ' ',
                        'before': '',
                        'characterOffsetBegin': 0,
                        'characterOffsetEnd': 9,
                        'index': 1,
                        'lemma': 'Microsoft',
                        'ner': 'ORGANIZATION',
                        'originalText': 'Microsoft',
                        'pos': 'NNP',
                        'word': 'Microsoft'},
                       {'after': ' ',
                        'before': ' ',
                        'characterOffsetBegin': 10,
                        'characterOffsetEnd': 12,
                        'index': 2,
                        'lemma': 'be',
                        'ner': 'O',
                        'originalText': 'is',
                        'pos': 'VBZ',
                        'word': 'is'},
                       {'after': ' ',
                        'before': ' ',
                        'characterOffsetBegin': 13,
                        'characterOffsetEnd': 18,
                        'index': 3,
                        'lemma': 'base',
                        'ner': 'O',
                        'originalText': 'based',
                        'pos': 'VBN',
                        'word': 'based'},
                       {'after': ' ',
                        'before': ' ',
                        'characterOffsetBegin': 19,
                        'characterOffsetEnd': 21,
                        'index': 4,
                        'lemma': 'in',
                        'ner': 'O',
                        'originalText': 'in',
                        'pos': 'IN',
                        'word': 'in'},
                       {'after': ' ',
                        'before': ' ',
                        'characterOffsetBegin': 22,
                        'characterOffsetEnd': 25,
                        'index': 5,
                        'lemma': 'New',
                        'ner': 'LOCATION',
                        'originalText': 'New',
                        'pos': 'NNP',
                        'word': 'New'},
                       {'after': '',
                        'before': ' ',
                        'characterOffsetBegin': 26,
                        'characterOffsetEnd': 30,
                        'index': 6,
                        'lemma': 'York',
                        'ner': 'LOCATION',
                        'originalText': 'York',
                        'pos': 'NNP',
                        'word': 'York'},
                       {'after': '',
                        'before': '',
                        'characterOffsetBegin': 30,
                        'characterOffsetEnd': 31,
                        'index': 7,
                        'lemma': '.',
                        'ner': 'O',
                        'originalText': '.',
                        'pos': '.',
                        'word': '.'}]}]}

I can see on the CoreNLP server terminal that the relation extraction model is loaded. 我可以看到,相对于提取模型加载 CoreNLP服务器终端上。

[pool-1-thread-1] INFO edu.stanford.nlp.pipeline.RelationExtractorAnnotator - Loading relation model from edu/stanford/nlp/models/supervised_relation_extractor/roth_relation_model_pipelineNER.ser

What am I missing here? 我在这里想念什么?

Thanks! 谢谢!

I think ultimately nobody added that output to the JSON for that annotator, which we can do eventually. 我认为最终没有人将该输出添加到该注释器的JSON中,我们最终可以做到。

Right now the relation extraction we are mainly supporting is the new kbp annotator. 现在,我们主要支持的关系提取是新的kbp注释器。 This extracts the relations from the TAC-KBP challenge. 这从TAC-KBP挑战中提取关系。

You can find the relation descriptions here: https://tac.nist.gov//2015/KBP/ColdStart/guidelines/TAC_KBP_2015_Slot_Descriptions_V1.0.pdf 您可以在这里找到关系描述: https : //tac.nist.gov//2015/KBP/ColdStart/guidelines/TAC_KBP_2015_Slot_Descriptions_V1.0.pdf

Here is an example command that I ran: 这是我运行的示例命令:

java -Xmx8g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,mention,entitymentions,coref,kbp -file microsoft-example.txt -outputFormat json

If you look at the JSON you'll see the proper relation has been extracted. 如果查看JSON,您将看到已提取正确的关系。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM