I am using Stanford CoreNLP to try to find the grammatical relations of noun phrases.
Here is an example:
Given the sentence "The fitness room was dirty."
I managed to identify "The fitness room" as my target noun phrase. I am now looking for a way to find that the "dirty" adjective has a relationship to "the fitness room" and not only to "room".
Example code:
import java.util.Collection;
import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.util.CoreMap;

private static void doSentenceTest() {
    Properties props = new Properties();
    props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP stanford = new StanfordCoreNLP(props);
    // Tregex pattern that matches every NP node in a parse tree
    TregexPattern npPattern = TregexPattern.compile("@NP");
    String text = "The fitness room was dirty.";
    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);
    // run all Annotators on this text
    stanford.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        Tree sentenceTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        TregexMatcher matcher = npPattern.matcher(sentenceTree);
        while (matcher.find()) {
            // this tree should contain "The fitness room"
            Tree nounPhraseTree = matcher.getMatch();
            // Question: how do I find that "dirty" has a relationship to the nounPhraseTree?
        }
        // Output dependency tree
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(sentenceTree);
        Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
        System.out.println("typedDependencies: " + tdl);
    }
}
I ran Stanford CoreNLP on the sentence and extracted its root Tree object. From this tree I managed to extract the noun phrases using a TregexPattern and a TregexMatcher, which gives me a child Tree that contains the actual noun phrase. What I would like to do now is find the modifiers of that noun phrase in the original sentence.
The typedDependencies output gives me the following:
typedDependencies: [det(room-3, The-1), nn(room-3, fitness-2), nsubj(dirty-5, room-3), cop(dirty-5, was-4), root(ROOT-0, dirty-5)]
where I can see nsubj(dirty-5, room-3), but I don't have the full noun phrase as the dominator.
I hope I am clear enough. Any help appreciated.
The typed dependencies do show that the adjective 'dirty' applies to 'the fitness room':
det(room-3, The-1)
nn(room-3, fitness-2)
nsubj(dirty-5, room-3)
cop(dirty-5, was-4)
root(ROOT-0, dirty-5)
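The 'nn' relation is the noun compound modifier, indicating that 'fitness' is a modifier of 'room'. In other words, 'room' is the head of the noun phrase: nsubj links 'dirty' to that head, and 'The' and 'fitness' attach to it through det and nn.
You can find detailed information on the dependency tags in the Stanford dependency manual.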
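To make that reading concrete, here is a minimal sketch of how the full phrase could be rebuilt from the dependency list: take the dependent of nsubj as the head, then gather every word that depends on it, directly or transitively. To keep it runnable without the CoreNLP jars, it uses a hypothetical `Dep` class as a stand-in for `TypedDependency`; the class and the method names (`phraseFor`, `subjectPhrases`, `exampleDeps`) are assumptions for illustration, not part of the Stanford API. The data is the dependency output from the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class NounPhraseFromDeps {

    /** Hypothetical stand-in for a typed dependency: rel(govWord-govIdx, depWord-depIdx). */
    static final class Dep {
        final String rel, govWord, depWord;
        final int govIdx, depIdx;

        Dep(String rel, String govWord, int govIdx, String depWord, int depIdx) {
            this.rel = rel;
            this.govWord = govWord;
            this.govIdx = govIdx;
            this.depWord = depWord;
            this.depIdx = depIdx;
        }
    }

    /** Returns the head word plus everything that depends on it, in sentence order. */
    static String phraseFor(int headIdx, String headWord, List<Dep> deps) {
        Map<Integer, String> words = new TreeMap<>(); // sorted by token index
        words.put(headIdx, headWord);
        collect(headIdx, deps, words);
        return String.join(" ", words.values());
    }

    /** Recursively adds the dependents of the token at govIdx; skips tokens already seen. */
    private static void collect(int govIdx, List<Dep> deps, Map<Integer, String> words) {
        for (Dep d : deps) {
            if (d.govIdx == govIdx && !words.containsKey(d.depIdx)) {
                words.put(d.depIdx, d.depWord);
                collect(d.depIdx, deps, words);
            }
        }
    }

    /** Maps each predicate that has an nsubj to the full phrase headed by its subject. */
    static Map<String, String> subjectPhrases(List<Dep> deps) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Dep d : deps) {
            if (d.rel.equals("nsubj")) {
                result.put(d.govWord, phraseFor(d.depIdx, d.depWord, deps));
            }
        }
        return result;
    }

    /** The dependencies printed in the question for "The fitness room was dirty." */
    static List<Dep> exampleDeps() {
        List<Dep> deps = new ArrayList<>();
        deps.add(new Dep("det", "room", 3, "The", 1));
        deps.add(new Dep("nn", "room", 3, "fitness", 2));
        deps.add(new Dep("nsubj", "dirty", 5, "room", 3));
        deps.add(new Dep("cop", "dirty", 5, "was", 4));
        deps.add(new Dep("root", "ROOT", 0, "dirty", 5));
        return deps;
    }

    public static void main(String[] args) {
        System.out.println(subjectPhrases(exampleDeps())); // prints {dirty=The fitness room}
    }
}
```

With real CoreNLP output, the same walk would presumably read the relation name, governor and dependent from each TypedDependency instead of from `Dep`. Note that collecting all dependents can pull in more than the bare NP (a relative clause attached to the head, for example), so you may want to restrict the walk to relations such as det, nn and amod.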
Replace the call
Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
with
Collection<TypedDependency> tdl = gs.typedDependenciesCCprocessed();
or
Collection<TypedDependency> tdl = gs.allDependencies();