Dependency tree from the Stanford Parser in NLTK does not match the output of the Stanford Parser itself

I am comparing the output of the Stanford Parser as called through NLTK with the output of the Stanford Parser itself, and I do not understand why the two differ. I have checked related questions, but they did not help much.

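from nltk.parse.stanford import StanfordDependencyParser

stan_dep_parser = StanfordDependencyParser()  # Stanford parser interface from NLTK
dependency_parser = stan_dep_parser.raw_parse("Four men died in an accident")
dep = next(dependency_parser)  # raw_parse returns an iterator of dependency graphs
for triple in dep.triples():
    # each triple is ((head_word, head_tag), relation, (dependent_word, dependent_tag))
    print(triple[1], "(", triple[0][0], ", ", triple[2][0], ")")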

Current Output:

nsubj ( died ,  men )
nummod ( men ,  Four )
nmod ( died ,  accident )
case ( accident ,  in )
det ( accident ,  an )

Expected output according to the Stanford Parser:

nummod(men-2, Four-1)
nsubj(died-3, men-2)
root(ROOT-0, died-3)
case(accident-6, in-4)
det(accident-6, an-5)
nmod(died-3, accident-6)

NLTK version: 3.2.4
Stanford Parser: stanford-parser-3.8.0-models

I solved the problem myself.

triples() does not emit the root relation, so I first found the 'root' (head) of the sentence, i.e. the node whose head is 0:

from nltk.parse.stanford import StanfordDependencyParser

final_dependency = []
sentence = "Four men died in an accident"
dependency_tree = StanfordDependencyParser()
dependency_parser = dependency_tree.raw_parse(sentence)
parsetree = list(dependency_parser)[0]  # first (best) dependency graph
for k in parsetree.nodes.values():
    # the node whose head is 0 hangs off the artificial ROOT node
    if k["head"] == 0:
        final_dependency.append(str(k["rel"]) + "(" + "ROOT" + "-"
            + str(k["head"]) + ", " + str(k["word"]) + "-" + str(k["address"]) + ")")

Then I appended each word's index to the word, as in the expected output, using simple string operations. The numbers are the positions of the words in the sentence, which each node stores in its address field.
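
The string step above could be sketched roughly as follows. This is only an illustration, not the exact code I used: format_dependencies is a hypothetical helper name, and the nodes dict is a hand-written stand-in for parsetree.nodes (keeping only the word, head and rel fields) so that the snippet runs without the Stanford jars installed.

```python
def format_dependencies(nodes):
    """Return 'rel(head-i, dep-j)' strings, ordered by the dependent's index."""
    lines = []
    for address in sorted(nodes):
        node = nodes[address]
        head = node["head"]
        if head is None:
            # the artificial ROOT node at address 0 has no head; skip it
            continue
        head_word = "ROOT" if head == 0 else nodes[head]["word"]
        lines.append("%s(%s-%d, %s-%d)"
                     % (node["rel"], head_word, head, node["word"], address))
    return lines


# Assumption: a hand-built stand-in for parsetree.nodes for the example sentence.
nodes = {
    0: {"word": None, "head": None, "rel": None},
    1: {"word": "Four", "head": 2, "rel": "nummod"},
    2: {"word": "men", "head": 3, "rel": "nsubj"},
    3: {"word": "died", "head": 0, "rel": "root"},
    4: {"word": "in", "head": 6, "rel": "case"},
    5: {"word": "an", "head": 6, "rel": "det"},
    6: {"word": "accident", "head": 3, "rel": "nmod"},
}

for line in format_dependencies(nodes):
    print(line)
```

With the real parser, passing parsetree.nodes instead of the hand-built dict should work the same way, since each node carries the same word, head and rel keys. Sorting by address is what puts the relations in the same order as the expected output.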
