
Filter Stanford Dependency Parser Output

How can I modify this code so that the output contains only one particular dependency relation? For example, how can I get just the 'nmod' or 'dobj' triples?

from nltk.parse.stanford import StanfordDependencyParser
from nltk.tokenize import word_tokenize
from nltk.tree import Tree

# Raw strings keep the backslashes in the Windows paths literal
stanford_models = r'E:\stanford-parser\stanford-parser-3.7.0-models.jar'
stanford_jar = r'E:\stanford-parser\stanford-parser.jar'
# Keyword arguments make it explicit which jar is which
st = StanfordDependencyParser(path_to_jar=stanford_jar, path_to_models_jar=stanford_models, encoding='utf-8')
text = 'Randy,Can you send me a schedule of the salary.'
result = st.raw_parse(text)
dep = next(result)  # take the first dependency graph from the iterator
list(dep.triples())

The output is:

[(('send', 'VB'), 'discourse', ('Randy', 'UH')),
 (('send', 'VB'), 'aux', ('Can', 'MD')),
 (('send', 'VB'), 'nsubj', ('you', 'PRP')),
 (('send', 'VB'), 'iobj', ('me', 'PRP')),
 (('send', 'VB'), 'dobj', ('schedule', 'NN')),
 (('schedule', 'NN'), 'det', ('a', 'DT')),
 (('schedule', 'NN'), 'nmod', ('salary', 'NN')),
 (('salary', 'NN'), 'case', ('of', 'IN')),
 (('salary', 'NN'), 'det', ('the', 'DT'))]

All you need to do is call filter(..) on the triples and, if necessary, convert the result back to a list with list(..):

the_triples = list(dep.triples())  # you already have this line
result = filter(lambda v: v[1] == 'nmod' or v[1] == 'dobj', the_triples)

If you run this under Python 2, result will be a list. Under Python 3, result will be a lazy iterator (a filter object), so processing is delayed until you actually need the values. You can convert that iterator to a list by calling list(..) on it.
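A minimal sketch of that difference, assuming Python 3. To keep it runnable without the parser, three triples from the output above are hard-coded; the name sample_triples is only for illustration.

```python
# Three triples copied from the parser output above (hard-coded for illustration).
sample_triples = [
    (('send', 'VB'), 'nsubj', ('you', 'PRP')),
    (('send', 'VB'), 'dobj', ('schedule', 'NN')),
    (('schedule', 'NN'), 'nmod', ('salary', 'NN')),
]

lazy = filter(lambda v: v[1] == 'nmod' or v[1] == 'dobj', sample_triples)

# In Python 3 this is a filter object, not a list; nothing has been evaluated yet.
is_list_before = isinstance(lazy, list)

# list(..) consumes the iterator and materializes the matching triples.
matches = list(lazy)

# The iterator is now exhausted, so a second pass yields nothing.
leftover = list(lazy)

print(is_list_before)
print(matches)
print(leftover)
```

Because the filter object can only be consumed once, it is usually simplest to wrap the call in list(..) right away if you plan to look at the result more than once.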

filter(function, iterable) takes a function and an iterable as input. As the iterable we pass the list of triples; as the function we use lambda v: v[1] == 'nmod' or v[1] == 'dobj', which takes a triple and returns True if its second element (the relation label) is either 'nmod' or 'dobj'. If the function evaluates to True for a triple, that triple is emitted; otherwise it is skipped.
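If you want to filter on several relations, or reuse the filter, the same idea can be written as a small helper with a list comprehension. This is only a sketch: filter_triples is a hypothetical name, not part of NLTK, and the triples are copied from the output above so the example runs without the parser.

```python
def filter_triples(triples, relations):
    """Return the triples whose relation label (middle element) is in `relations`.

    Hypothetical helper for illustration; not part of NLTK.
    """
    wanted = set(relations)
    return [t for t in triples if t[1] in wanted]


# Triples copied from the parser output shown in the question.
triples = [
    (('send', 'VB'), 'discourse', ('Randy', 'UH')),
    (('send', 'VB'), 'aux', ('Can', 'MD')),
    (('send', 'VB'), 'nsubj', ('you', 'PRP')),
    (('send', 'VB'), 'iobj', ('me', 'PRP')),
    (('send', 'VB'), 'dobj', ('schedule', 'NN')),
    (('schedule', 'NN'), 'det', ('a', 'DT')),
    (('schedule', 'NN'), 'nmod', ('salary', 'NN')),
    (('salary', 'NN'), 'case', ('of', 'IN')),
    (('salary', 'NN'), 'det', ('the', 'DT')),
]

selected = filter_triples(triples, ['nmod', 'dobj'])

# Each triple unpacks as (governor, relation, dependent).
for governor, relation, dependent in selected:
    print(relation, governor[0], dependent[0])
# prints:
# dobj send schedule
# nmod schedule salary
```

With the real parser you would pass dep.triples() as the first argument instead of the hard-coded list. The comprehension returns a list on both Python 2 and Python 3, so no extra list(..) call is needed.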
