
Navigate an NLTK tree (follow-up)

I previously asked how to properly navigate an NLTK tree:

How do I properly navigate through an NLTK tree (or ParentedTree)? I would like to identify a certain leaf with the parent node "VBZ", then I would like to move from there further up the tree and to the left to identify the NP node.

Original question

And provided the following illustration:

[Image: NLTK tree]

I got the following (very helpful) answer from Tommy (thank you!):

from nltk.tree import ParentedTree

np_trees = []

def traverse(t):
    # Leaves are plain strings and have no label(); skip them.
    try:
        t.label()
    except AttributeError:
        return

    if t.label() == "VBZ":
        current = t
        # Climb towards the root, scanning the left siblings at each level.
        while current.parent() is not None:

            while current.left_sibling() is not None:

                if current.left_sibling().label() == "NP":
                    np_trees.append(current.left_sibling())

                current = current.left_sibling()

            current = current.parent()

    for child in t:
        traverse(child)

tree = ParentedTree.fromstring("(S (NP (NNP)) (VP (VBZ) (NP (NNP))))")
traverse(tree)
print(np_trees)  # [ParentedTree('NP', [ParentedTree('NNP', [])])]

But how can I include the condition that I only extract those NP nodes that have an NNP child node?

Any help would be much appreciated again.

(Generally, if there is any expert on NLTK trees among you, I would love to chat with you and pay a few coffees in exchange for a bit of insight.)

I usually use the subtrees function in combination with a filter for this. Changing your tree slightly to show that it now selects only one of the NPs:

>>> tree = ParentedTree.fromstring("(S (NP (NNP)) (VP (VBZ) (NP (NNS))))")
>>> for st in tree.subtrees(filter = lambda x: x.label() == "NP" and x[0].label() == 'NNP'):
...     print(st)
... 
(NP (NNP ))

This may crash, however, when x[0] doesn't have a label (when it's a terminal, for example), or throw an IndexError when your NP is completely empty. I'd say those scenarios aren't very likely, but I may well be overlooking something here, so you may want to build in some extra checks.
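As a rough sketch of what such extra checks could look like: the helper names `is_np_with_child` and `nps_left_of_vbz` below are made up for illustration and are not part of NLTK. The predicate skips terminals and empty nodes instead of raising. Note that it accepts an NNP at any child position, which is slightly broader than the `x[0]` test above. The second function is one possible way to combine the check with the "up and to the left of VBZ" walk from the question, assuming the input is a ParentedTree.

```python
from nltk.tree import ParentedTree, Tree

def is_np_with_child(node, child_label="NNP"):
    """Hypothetical helper: True if node is an NP subtree with a direct
    child subtree labelled child_label. Terminals (plain strings) and
    empty NPs simply yield False instead of raising."""
    if not isinstance(node, Tree) or node.label() != "NP":
        return False
    return any(isinstance(child, Tree) and child.label() == child_label
               for child in node)

def nps_left_of_vbz(tree, child_label="NNP"):
    """Hypothetical helper: for every VBZ node, climb towards the root and
    collect qualifying NP nodes that sit to the left at each level."""
    found = []
    for vbz in tree.subtrees(filter=lambda st: st.label() == "VBZ"):
        current = vbz
        while current.parent() is not None:
            parent = current.parent()
            # Indexing by position avoids calling left_sibling() on a terminal.
            for i in range(current.parent_index()):
                if is_np_with_child(parent[i], child_label):
                    found.append(parent[i])
            current = parent
    return found

tree = ParentedTree.fromstring(
    "(S (NP (NNP John)) (VP (VBZ sees) (NP (NNS dogs)) (NP)))")

# Filter over the whole tree, independent of where VBZ sits.
for st in tree.subtrees(filter=is_np_with_child):
    print(st)

# Only NPs found up and to the left of a VBZ node.
for st in nps_left_of_vbz(tree):
    print(st)
```

If only the first child should count, as in the original filter, the `any(...)` test could be replaced with a check on `node[0]` guarded by `len(node) > 0`.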


 