
Using tree surgeon to adjoin POS to NN

I'm trying to work through some Tree objects and need to adjoin the "'s" possessive (POS) nodes to their respective nouns (NN).

I'm hoping the tsurgeon tools will do this, and they do seem designed for the task. However, the errors I get are strange and unhelpful.

I'll try to set this up as well as I can, with the context of the application and the output I see. I wrote a small test program to work out this use case, but even that is somewhat large and complex, so please excuse the setup.

List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
//Pattern: http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/tregex/TregexPattern.html
TregexPattern adjoinPOS = TregexPattern.compile("POS=pos , NN=noun");
TsurgeonPattern tsurgeon = Tsurgeon.parseOperation("adjoin pos@ noun");
for( CoreMap sentence : sentences ) {
   Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
   tree = Tsurgeon.processPattern(adjoinPOS, tsurgeon, tree);
   tree.pennPrint();
}

Unfortunately, this does nothing productive. Instead, I get a NullPointerException inside Stanford NLP:

Exception in thread "main" java.lang.NullPointerException
    at edu.stanford.nlp.trees.tregex.tsurgeon.AdjoinNode$Matcher.evaluate(AdjoinNode.java:49)
    at edu.stanford.nlp.trees.tregex.tsurgeon.TsurgeonPatternRoot$Matcher.evaluate(TsurgeonPatternRoot.java:63)
    at edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon.processPattern(Tsurgeon.java:579)
    at my.code.line of the processPattern call (yeah, I cleaned this up a little for brevity)

Let's assume that the sentence tree is:

(ROOT (SBARQ (WHNP (WP What)) (SQ (VBZ is) (NP (NP (NP (DT the) (NN cannonball) (POS 's)) (NN maximum) (NN altitude)) (PP (IN during) (NP (NN flight))))) (. ?)))

Can anyone give me any pointers on how to use the tree surgeon to edit this tree?

You don't want to use adjoin in this case. adjoin takes an auxiliary tree written in PTB bracket format and adjoins it into a named node of the matched tree, so it is not meant for joining two nodes that are already in the same tree.

I think what you want is something like this:

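import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.trees.tregex.tsurgeon.Tsurgeon;
import edu.stanford.nlp.trees.tregex.tsurgeon.TsurgeonPattern;
Tree t = Tree.valueOf("(ROOT (SBARQ (WHNP (WP What)) (SQ (VBZ is) (NP (NP (NP (DT the) (NN cannonball) (POS 's)) (NN maximum) (NN altitude)) (PP (IN during) (NP (NN flight))))) (. ?)))");
//Pattern: http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/tregex/TregexPattern.html
TregexPattern adjoinPOS = TregexPattern.compile("(POS=postag < __=pos ) $- NN=noun");
TsurgeonPattern tsurgeon = Tsurgeon.parseOperation("[move pos >-1 noun] [delete postag]");
// processPattern returns the transformed tree, so keep the result
t = Tsurgeon.processPattern(adjoinPOS, tsurgeon, t);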

This will move the clitic 's next to cannonball (the NN will then have two children) and delete the POS node.
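To make the effect of those two operations concrete, here is a minimal plain-Java sketch that performs the same kind of rewrite by hand on a tiny tree. It does not use Stanford CoreNLP at all: the Node class, the mergePossessives method and the PossessiveMerge wrapper are hypothetical names invented for this illustration, and the matching rule (a POS node whose immediate left sister is an NN) is only my reading of the pattern above.

```java
import java.util.ArrayList;
import java.util.List;

class PossessiveMerge {

    /** Hypothetical minimal tree node; not part of Stanford CoreNLP. */
    static class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) {
                children.add(k);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }

        /** Penn-style bracketed rendering of this subtree. */
        String toBracketed() {
            if (isLeaf()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) {
                sb.append(' ').append(c.toBracketed());
            }
            return sb.append(')').toString();
        }
    }

    /**
     * For every POS node whose immediate left sister is an NN, append the
     * POS node's children to the NN and remove the POS node. Applied
     * recursively to the whole subtree under parent.
     */
    static void mergePossessives(Node parent) {
        for (int i = 1; i < parent.children.size(); i++) {
            Node left = parent.children.get(i - 1);
            Node cur = parent.children.get(i);
            boolean curIsPos = cur.label.equals("POS") && !cur.isLeaf();
            boolean leftIsNoun = left.label.equals("NN") && !left.isLeaf();
            if (curIsPos && leftIsNoun) {
                left.children.addAll(cur.children); // analogous to "move pos >-1 noun"
                parent.children.remove(i);          // analogous to "delete postag"
                i--;                                // re-check the sister that shifted into slot i
            }
        }
        for (Node child : parent.children) {
            mergePossessives(child);
        }
    }

    public static void main(String[] args) {
        Node np = new Node("NP",
                new Node("DT", new Node("the")),
                new Node("NN", new Node("cannonball")),
                new Node("POS", new Node("'s")));
        mergePossessives(np);
        System.out.println(np.toBracketed()); // prints (NP (DT the) (NN cannonball 's))
    }
}
```

The NN ends up with two children, cannonball and 's, which matches the result described above for the Tsurgeon version.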
