
How can I find the grammatical relations of a noun phrase using Stanford Parser or Stanford CoreNLP?

I am using Stanford CoreNLP to try to find the grammatical relations of noun phrases.

Here is an example:

Given the sentence "The fitness room was dirty."

I managed to identify "The fitness room" as my target noun phrase. I am now looking for a way to find that the adjective "dirty" is related to "the fitness room" as a whole, and not only to "room".

Example code:

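import java.util.Collection;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.PennTreebankLanguagePack;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.util.CoreMap;

private static void doSentenceTest() {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP stanford = new StanfordCoreNLP(props);

    // Tregex pattern that matches every NP node in a parse tree
    TregexPattern npPattern = TregexPattern.compile("@NP");

    String text = "The fitness room was dirty.";

    // create an Annotation that holds just the given text
    Annotation document = new Annotation(text);
    // run all annotators on this text
    stanford.annotate(document);

    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {

        // constituency parse tree of the sentence
        Tree sentenceTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        TregexMatcher matcher = npPattern.matcher(sentenceTree);

        while (matcher.find()) {
            // this tree should contain "The fitness room"
            Tree nounPhraseTree = matcher.getMatch();
            // Question: how do I find that "dirty" has a relationship to nounPhraseTree?
        }

        // output the typed dependencies derived from the parse tree
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(sentenceTree);
        Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();

        System.out.println("typedDependencies: " + tdl);
    }
}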

I ran Stanford CoreNLP on the sentence and extracted its root Tree object. From that tree I extracted the noun phrases using a TregexPattern and a TregexMatcher, which gives me a child Tree containing the actual noun phrase. What I would like to do now is find the modifiers of that noun phrase in the original sentence.

The typedDependencies output gives me the following:

typedDependencies: [det(room-3, The-1), nn(room-3, fitness-2), nsubj(dirty-5, room-3), cop(dirty-5, was-4), root(ROOT-0, dirty-5)]

where I can see nsubj(dirty-5, room-3), but I don't have the full noun phrase as the dominator.

I hope I am clear enough. Any help appreciated.

The typed dependencies do show that the adjective 'dirty' applies to 'the fitness room':

det(room-3, The-1)
nn(room-3, fitness-2)
nsubj(dirty-5, room-3)
cop(dirty-5, was-4)
root(ROOT-0, dirty-5)

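The 'nn' relation is the noun compound modifier, indicating that 'fitness' is a modifier of 'room'. 'room' is therefore the head of the noun phrase, so the nsubj relation between 'dirty' and 'room' applies to the whole phrase 'The fitness room'.

You can find detailed information on the dependency relations in the Stanford dependencies manual.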
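To make the link between 'dirty' and the whole phrase explicit, one option is to look for dependencies that have exactly one end inside the noun phrase's token span. The following is only a minimal sketch in plain Java, not CoreNLP code: `Dep` is a hypothetical stand-in for `TypedDependency`, the dependencies are typed in by hand from the output above, and the span 1..3 for "The fitness room" is assumed to have been worked out from the matched NP subtree.

```java
import java.util.ArrayList;
import java.util.List;

class NounPhraseRelations {

    /** Hypothetical stand-in for one typed dependency, e.g. nsubj(dirty-5, room-3). */
    static final class Dep {
        final String reln;
        final String gov;
        final int govIndex;
        final String dep;
        final int depIndex;

        Dep(String reln, String gov, int govIndex, String dep, int depIndex) {
            this.reln = reln;
            this.gov = gov;
            this.govIndex = govIndex;
            this.dep = dep;
            this.depIndex = depIndex;
        }

        @Override
        public String toString() {
            return reln + "(" + gov + "-" + govIndex + ", " + dep + "-" + depIndex + ")";
        }
    }

    /** The dependencies printed in the question, entered by hand. */
    static List<Dep> exampleDependencies() {
        List<Dep> deps = new ArrayList<>();
        deps.add(new Dep("det", "room", 3, "The", 1));
        deps.add(new Dep("nn", "room", 3, "fitness", 2));
        deps.add(new Dep("nsubj", "dirty", 5, "room", 3));
        deps.add(new Dep("cop", "dirty", 5, "was", 4));
        deps.add(new Dep("root", "ROOT", 0, "dirty", 5));
        return deps;
    }

    /** True if the 1-based token index lies inside the inclusive span [start, end]. */
    static boolean inSpan(int index, int start, int end) {
        return index >= start && index <= end;
    }

    /** Returns the dependencies with exactly one end inside the span [start, end]. */
    static List<Dep> crossingDependencies(List<Dep> deps, int start, int end) {
        List<Dep> crossing = new ArrayList<>();
        for (Dep d : deps) {
            if (inSpan(d.govIndex, start, end) != inSpan(d.depIndex, start, end)) {
                crossing.add(d);
            }
        }
        return crossing;
    }

    public static void main(String[] args) {
        // "The fitness room" covers tokens 1..3 of the sentence.
        for (Dep d : crossingDependencies(exampleDependencies(), 1, 3)) {
            System.out.println(d); // prints: nsubj(dirty-5, room-3)
        }
    }
}
```

The det and nn relations stay entirely inside the span, and cop and root stay entirely outside it, so nsubj is the only relation that connects the phrase to the rest of the sentence. In real code the same check would run over the `TypedDependency` collection, using the token indices of the leaves of the matched NP tree.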

Replace the call

Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();

with

Collection<TypedDependency> tdl = gs.typedDependenciesCCprocessed();

or

Collection<TypedDependency> tdl = gs.allDependencies();
