The pretty-printer of the nltk.Tree class prints a tree in the following format:
print spacy2tree(nlp(u'Williams is a defensive coach') )
(S
  (SUBJ Williams/NNP)
  (PRED is/VBZ test/VBN)
  a/DT
  defensive/JJ
  coach/NN)
As a Tree object:
spacy2tree(nlp(u'Williams is a defensive coach') )
Tree('S', [Tree('SUBJ', [(u'Williams', u'NNP')]),
Tree('PRED', [(u'is', u'VBZ'), ('test', 'VBN')]), (u'a', u'DT'), (u'defensive', u'JJ'), (u'coach', u'NN')])
But Tree.fromstring doesn't ingest that string correctly; the (word, tag) leaves come back as plain 'word/TAG' strings:
tfs = spacy2tree(nlp(u'Williams is a defensive coach') ).pformat()
Tree.fromstring(tfs)
Tree('S', [Tree('SUBJ', ['Williams/NNP']),
Tree('PRED', ['is/VBZ', 'test/VBN']), 'a/DT', 'defensive/JJ', 'coach/NN'])
For example (correct on the left, incorrect on the right):
('SUBJ', [(u'Williams', u'NNP')]) =vs=> ('SUBJ', ['Williams/NNP'])
('PRED', [(u'is', u'VBZ'), ('test', 'VBN')]) =vs=> ('PRED', ['is/VBZ', 'test/VBN'])
Is there a utility to ingest a Tree from a string correctly?
It seems I figured it out: pass a read_leaf function that splits each 'word/TAG' leaf back into a tuple:
Tree.fromstring(tfs, read_leaf=lambda s: tuple(s.split('/')))
Tree('S', [Tree('SUBJ', [(u'Williams', u'NNP')]),
Tree('PRED', [(u'is', u'VBZ'), (u'test', u'VBN')]), (u'a', u'DT'), (u'defensive', u'JJ'), (u'coach', u'NN')])
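One caveat with the lambda above: it splits on every slash, so a token that itself contains a slash (a fraction such as 1/2, say) would come back as a tuple with more than two elements. A minimal sketch of a stricter reader follows. The name read_tagged_leaf is hypothetical (it is not part of NLTK), and it assumes the tag is always whatever follows the last separator:

```python
def read_tagged_leaf(leaf, sep="/"):
    """Split a 'word/TAG' leaf string into a (word, tag) tuple.

    Only the LAST separator is used, so a word that contains the
    separator itself (e.g. '1/2/CD') keeps its slash.
    """
    word, found, tag = leaf.rpartition(sep)
    if not found:
        # No separator in the leaf: return the token with no tag.
        return (leaf, None)
    return (word, tag)


print(read_tagged_leaf("Williams/NNP"))
print(read_tagged_leaf("1/2/CD"))
```

It should be usable in place of the lambda, i.e. Tree.fromstring(tfs, read_leaf=read_tagged_leaf). NLTK also ships nltk.tag.str2tuple, which as far as I know splits on the last separator in a similar way, so it may be worth checking before rolling your own.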
So now converting the re-read tree to CoNLL tags works correctly too:
tree2conlltags(Tree.fromstring(tfs, read_leaf=lambda s: tuple(s.split('/'))))
[(u'Williams', u'NNP', u'B-SUBJ'),
(u'is', u'VBZ', u'B-PRED'),
(u'test', u'VBN', u'I-PRED'),
(u'a', u'DT', u'O'),
(u'defensive', u'JJ', u'O'),
(u'coach', u'NN', u'O')]