Using Stanford CoreNLP Python Parser for specific output

I'm using the Stanford CoreNLP Python wrapper to get the CFG parse tree for English sentences.

from corenlp import *
corenlp = StanfordCoreNLP()
corenlp.parse("Every cat loves a dog")

My expected output is a tree like this:

(S (NP (DET Every) (NN cat)) (VP (VT loves) (NP (DET a) (NN dog))))

But what I get is:

(ROOT (S (NP (DT Every) (NN cat)) (VP (VBZ loves) (NP (DT a) (NN dog)))))

How can I change the POS tags as expected and remove the ROOT node?

Thanks

You can use the nltk.tree module from NLTK to parse the bracketed output, relabel the nodes, and drop the ROOT wrapper.

from nltk.tree import ParentedTree

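def traverse(t):
    # Leaves are plain strings without label(); stop recursing there.
    if not hasattr(t, "label"):
        return

    # Replace the Penn Treebank tags with the desired labels
    if t.label() == "DT":
        t.set_label("DET")
    elif t.label() == "VBZ":
        t.set_label("VT")

    for child in t:
        traverse(child)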

output_tree = "(ROOT (S (NP (DT Every) (NN cat)) (VP (VBZ loves) (NP (DT a) (NN dog)))))"
tree = ParentedTree.fromstring(output_tree)

# Remove the ROOT element by keeping only its single child
if tree.label() == "ROOT":
    tree = tree[0]

traverse(tree)
print(tree)
# (S (NP (DET Every) (NN cat)) (VP (VT loves) (NP (DET a) (NN dog))))
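
If you would rather not depend on NLTK, the same relabeling can be sketched directly on the bracketed string with the standard library. This is only a minimal sketch: `TAG_MAP` and `relabel` are hypothetical names made up for illustration, and it assumes the parser output is a single well-formed bracketed string in which every label directly follows an opening parenthesis.

```python
import re

# Hypothetical mapping from Penn Treebank tags to the labels the question asks for.
TAG_MAP = {"DT": "DET", "VBZ": "VT"}


def relabel(parse, tag_map=TAG_MAP):
    """Rename node labels in a bracketed parse string and drop a ROOT wrapper."""
    parse = parse.strip()

    # Drop the outer "(ROOT ...)" wrapper if present.
    match = re.fullmatch(r"\(ROOT\s+(.*)\)", parse, flags=re.DOTALL)
    if match:
        parse = match.group(1).strip()

    # A label is the token right after an opening parenthesis;
    # labels missing from tag_map are left unchanged.
    return re.sub(
        r"\((\S+)",
        lambda m: "(" + tag_map.get(m.group(1), m.group(1)),
        parse,
    )


result = relabel("(ROOT (S (NP (DT Every) (NN cat)) (VP (VBZ loves) (NP (DT a) (NN dog)))))")
print(result)
# (S (NP (DET Every) (NN cat)) (VP (VT loves) (NP (DET a) (NN dog))))
```

Keeping the replacements in a dictionary means more tags can be added without touching the function. For anything beyond renaming labels, such as restructuring subtrees, the NLTK tree above is the safer route.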
