
Stanford CoreNLP: How to get RelationTriple triples with only named entities (OpenIE)?

I'm using CoreNLP Open Information Extraction (OpenIE) to find relation triples (subject, predicate, object) whose subject and object consist only of named entities. But I don't know how to get the entity type from a RelationTriple object, whose subject and object are exposed as a List<CoreMap>.

Below is the code from https://stanfordnlp.github.io/CoreNLP/openie.html :

import edu.stanford.nlp.ie.util.RelationTriple;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.naturalli.NaturalLogicAnnotations;
import edu.stanford.nlp.util.CoreMap;

import java.util.Collection;
import java.util.Properties;

/**
 * A demo illustrating how to call the OpenIE system programmatically.
 */
public class OpenIEDemo {

  public static void main(String[] args) throws Exception {
    // Create the Stanford CoreNLP pipeline
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize,ssplit,pos,lemma,depparse,natlog,openie");
    StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

    // Annotate an example document.
    Annotation doc = new Annotation("Obama was born in Hawaii. He is our president.");
    pipeline.annotate(doc);

    // Loop over sentences in the document
    for (CoreMap sentence : doc.get(CoreAnnotations.SentencesAnnotation.class)) {
      // Get the OpenIE triples for the sentence
      Collection<RelationTriple> triples = sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
      // Print the triples
      for (RelationTriple triple : triples) {
        // This is where I want to get the entity type of the triple's subject or object
        System.out.println(triple.confidence + "\t" +
            triple.subjectLemmaGloss() + "\t" +
            triple.relationLemmaGloss() + "\t" +
            triple.objectLemmaGloss());
      }
    }
  }
}

If there is a way to get the entity type from the RelationTriple class, I would be grateful for the help.

The subject and object instance variables should be lists of CoreLabel objects, which have named entity information attached via the ner() method. Something like the following should do what you want:

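// Needs java.util.List and java.util.stream.Collectors in addition to the imports above.
// The tokens only carry NER tags if the pipeline also runs the "ner" annotator;
// without it, token.ner() returns null and nothing is filtered out.
Collection<RelationTriple> triples =
    sentence.get(NaturalLogicAnnotations.RelationTriplesAnnotation.class);
List<RelationTriple> withNE = triples.stream()
    // make sure the subject is entirely named entities
    .filter(triple ->
        triple.subject.stream().noneMatch(token -> "O".equals(token.ner())))
    // make sure the object is entirely named entities
    .filter(triple ->
        triple.object.stream().noneMatch(token -> "O".equals(token.ner())))
    // convert the stream back to a list
    .collect(Collectors.toList());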
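
To make the filtering logic easier to try without the CoreNLP models on the classpath, here is a minimal, self-contained sketch of the same idea. Token and Triple are hypothetical stand-ins for CoreLabel and RelationTriple (only the ner() accessor and the public subject/object lists are mirrored), and the NER tags in the sample data are written by hand rather than produced by CoreNLP. Unlike the snippet above, it also treats a null tag as "not an entity", and it adds an assumed helper, entityType, that reports a span's type only when all of its tokens share one tag.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

public class NamedEntityTripleFilter {

  /** Hypothetical stand-in for CoreLabel: a word plus its NER tag. */
  static final class Token {
    final String word;
    final String nerTag; // "O" or null means "not part of a named entity"
    Token(String word, String nerTag) { this.word = word; this.nerTag = nerTag; }
    String ner() { return nerTag; }
  }

  /** Hypothetical stand-in for RelationTriple: only the subject and object spans. */
  static final class Triple {
    final List<Token> subject;
    final List<Token> object;
    Triple(List<Token> subject, List<Token> object) {
      this.subject = subject;
      this.object = object;
    }
  }

  /** True if the token has a tag other than "O" (a missing tag counts as no entity). */
  static boolean isEntityToken(Token token) {
    return token.ner() != null && !"O".equals(token.ner());
  }

  /** True if the span is non-empty and every token in it is an entity token. */
  static boolean isEntitySpan(List<Token> span) {
    return !span.isEmpty() && span.stream().allMatch(NamedEntityTripleFilter::isEntityToken);
  }

  /** The tag shared by all tokens of an entity span, or empty if there is no single tag. */
  static Optional<String> entityType(List<Token> span) {
    if (!isEntitySpan(span)) {
      return Optional.empty();
    }
    String first = span.get(0).ner();
    boolean uniform = span.stream().allMatch(token -> first.equals(token.ner()));
    return uniform ? Optional.of(first) : Optional.empty();
  }

  /** Keeps only the triples whose subject and object are entirely named entities. */
  static List<Triple> onlyNamedEntities(List<Triple> triples) {
    return triples.stream()
        .filter(triple -> isEntitySpan(triple.subject) && isEntitySpan(triple.object))
        .collect(Collectors.toList());
  }

  /** Hand-tagged sample data loosely based on the example sentences in the question. */
  static List<Triple> sampleTriples() {
    Triple bornIn = new Triple(
        Arrays.asList(new Token("Obama", "PERSON")),
        Arrays.asList(new Token("Hawaii", "LOCATION")));
    Triple isPresident = new Triple(
        Arrays.asList(new Token("He", "O")),
        Arrays.asList(new Token("our", null), new Token("president", "O")));
    return Arrays.asList(bornIn, isPresident);
  }

  public static void main(String[] args) {
    for (Triple triple : onlyNamedEntities(sampleTriples())) {
      System.out.println(entityType(triple.subject).orElse("MIXED")
          + " -> " + entityType(triple.object).orElse("MIXED"));
    }
  }
}
```

With the hand-tagged sample, only the first triple survives the filter. In the real pipeline the same checks would run against triple.subject and triple.object, with the annotators property extended to include ner, for example "tokenize,ssplit,pos,lemma,ner,depparse,natlog,openie".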
