Getting the type of an NLTK Tree()

Question

I am using NLTK's SemCor module:

import nltk
nltk.download('semcor')
from nltk.corpus import semcor

semcor.tagged_sents() iterates over the same sentences with additional annotation, including WordNet lemma identifiers.

semcor.tagged_sents(tag="sem")[0]
>>> [['The'],
 Tree(Lemma('group.n.01.group'), [Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])]),
 Tree(Lemma('state.v.01.say'), ['said']),
 Tree(Lemma('friday.n.01.Friday'), ['Friday']),
 ['an'],
 Tree(Lemma('probe.n.01.investigation'), ['investigation']),
 ['of'],
 Tree(Lemma('atlanta.n.01.Atlanta'), ['Atlanta']),
 ["'s"],
 Tree(Lemma('late.s.03.recent'), ['recent']),
 Tree(Lemma('primary.n.01.primary_election'), ['primary', 'election']),
 Tree(Lemma('produce.v.04.produce'), ['produced']),
 ['``'],
 ['no'],
 Tree(Lemma('evidence.n.01.evidence'), ['evidence']),
 ["''"],
 ['that'],
 ['any'],
 Tree(Lemma('abnormality.n.04.irregularity'), ['irregularities']),
 Tree(Lemma('happen.v.01.take_place'), ['took', 'place']),
 ['.']]

When I index into this list, I get the following output:

semcor.tagged_sents(tag="sem")[0][1][0]
>>> Tree('NE', ['Fulton', 'County', 'Grand', 'Jury'])

When I add one more index, I get the token from the list as output:

semcor.tagged_sents(tag="sem")[0][1][0][0]
>>> 'Fulton'

My goal is twofold:

What code can I use to get the lemma as output? The output would be:

>>> Lemma('group.n.01.group')

What code can I use to get the tree type as output? In this example:

>>> 'NE'

Answer 1

semcor.tagged_sents(tag="sem")[0][1].label()
# Lemma('group.n.01.group')

semcor.tagged_sents(tag="sem")[0][1][0].label()
#'NE'
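
The same two .label() calls can be applied to every chunk in a sentence. Below is a minimal sketch, assuming a hand-built stand-in for the start of the sentence shown in the question: plain strings replace the Lemma objects so that it runs without downloading the SemCor or WordNet corpora, and describe_chunk is a hypothetical helper, not part of NLTK.

```python
from nltk.tree import Tree

# Hand-built stand-in for the start of semcor.tagged_sents(tag="sem")[0].
# Assumption: plain strings replace the Lemma objects, so no corpus
# download is needed to run this sketch.
sentence = [
    ["The"],
    Tree("group.n.01.group", [Tree("NE", ["Fulton", "County", "Grand", "Jury"])]),
    Tree("state.v.01.say", ["said"]),
    Tree("friday.n.01.Friday", ["Friday"]),
    ["an"],
]


def describe_chunk(chunk):
    """Return (outer label, inner label, tokens) for one sentence chunk.

    Hypothetical helper for illustration; not part of NLTK.
    """
    if not isinstance(chunk, Tree):
        # Untagged chunks are plain lists of tokens and have no label.
        return None, None, list(chunk)
    # A nested subtree such as 'NE' shows up as the first child.
    first = chunk[0] if len(chunk) > 0 else None
    inner = first.label() if isinstance(first, Tree) else None
    # leaves() flattens any nesting and returns only the tokens.
    return chunk.label(), inner, chunk.leaves()


for chunk in sentence:
    print(describe_chunk(chunk))
```

With the real corpus, the outer label would be the Lemma object shown in the question's output rather than a string. The isinstance check matters because untagged chunks such as ['The'] are plain lists and have no label() method.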

Solution 1 (accepted, 2020-06-19 14:52:41)