Change the value of the POS tag in NLTK Tree leaf

I have an nltk.tree.Tree object.

from nltk.tree import Tree
t = Tree('S', [Tree('NP', [('I','tag')]), Tree('VP', [Tree('V', [('saw','tag')]), Tree('NP', [('him','tag')])])])

I want to traverse it with the function below and change every leaf's POS tag (i.e. 'tag' in the example above).

def traverse(tree):
    try:
        tree.label()
    except AttributeError:
        tree[-1] = ('another_tag')
        print(tree)
    else:
        for child in tree:
            traverse(child)

Unfortunately, none of the POS tags in the leaves can be changed this way, because each leaf is a tuple and tuples are immutable.

How can I change the POS tags in the example without affecting its original tree structure?

I'm fairly new to this tree structure, so please show some clear examples of how to deal with the nested structure.

An NLTK Tree is a subclass of list. With enumerate, you can loop through it and assign a new value to the node at position index. Something like:

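from nltk.tree import Tree

def traverse(tree):
    # Walk the children by index so each leaf can be replaced in place.
    for index, subtree in enumerate(tree):
        if isinstance(subtree, Tree):
            # Internal node: recurse into it.
            traverse(subtree)
        elif isinstance(subtree, tuple):
            # Leaf: tuples are immutable, so build a new (word, tag) tuple.
            tree[index] = (subtree[0], subtree[1].lower())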

Because you're dealing with tuples (which are immutable), you cannot replace only the POS tag; you have to create a new tuple and assign it back into the tree. The code above just makes the tag lowercase, but you can put anything you like as the second element of the new tuple.
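If the original tree should stay untouched, another option is to build a new tree instead of assigning into the existing one. The following is only a sketch: the function name retag and its new_tag parameter are made up for illustration, and the fallback Tree class is a hypothetical stand-in that exists only so the snippet runs where NLTK is not installed. It assumes every leaf is a (word, tag) tuple, as in the question.

```python
try:
    from nltk.tree import Tree
except ImportError:
    # Hypothetical minimal stand-in, used only when NLTK is not installed.
    # Like nltk.tree.Tree, it is a list subclass with a label() method.
    class Tree(list):
        def __init__(self, label, children=()):
            super().__init__(children)
            self._label = label

        def label(self):
            return self._label


def retag(tree, new_tag):
    """Return a new tree in which every (word, tag) leaf carries new_tag."""
    children = []
    for child in tree:
        if isinstance(child, Tree):
            # Internal node: rebuild the subtree recursively.
            children.append(retag(child, new_tag))
        else:
            # Leaf: assumed to be a (word, tag) tuple; build a fresh tuple.
            word, _old_tag = child
            children.append((word, new_tag))
    # Same label, new children; the input tree is never modified.
    return Tree(tree.label(), children)


t = Tree('S', [Tree('NP', [('I', 'tag')]),
               Tree('VP', [Tree('V', [('saw', 'tag')]),
                           Tree('NP', [('him', 'tag')])])])
new_t = retag(t, 'another_tag')
```

Here new_t has the same shape and labels as t, while t itself keeps its original tags. The in-place traverse above is simpler when the old tree is no longer needed.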

Hope this helps!
