How do I get all hyponyms that share a particular lowest common hypernym in NLTK WordNet?

Given that there is a path from two synsets up to their lowest common hypernym, it seems reasonable that there should be some way to walk back down and find the hyponyms that lead to that hypernym.

from nltk.corpus import wordnet as wn
alaska = wn.synset('Alaska.n.1')
california = wn.synset('California.n.1')
common_hypernym = alaska.lowest_common_hypernyms(california)[0]

common_hypernym
Synset('american_state.n.01')

common_hypernym.do_something_awesome()
['Alabama.n.1', 'Alaska.n.1', ...] #all 50 american states

Use Synset1._shortest_hypernym_paths(Synset2) to find the hypernyms and their distances:

>>> from nltk.corpus import wordnet as wn
>>> alaska = wn.synset('Alaska.n.1')
>>> california = wn.synset('California.n.1')

>>> alaska._shortest_hypernym_paths(california)
{Synset('district.n.01'): 4, Synset('location.n.01'): 6, Synset('region.n.03'): 5, Synset('physical_entity.n.01'): 8, Synset('entity.n.01'): 9, Synset('state.n.01'): 2, Synset('administrative_district.n.01'): 3, Synset('object.n.01'): 7, Synset('alaska.n.01'): 0, Synset('*ROOT*'): 10, Synset('american_state.n.01'): 1}

Now find the minimum path:

>>> paths = alaska._shortest_hypernym_paths(california)
>>> min(paths, key=paths.get)
Synset('alaska.n.01')

Now, this is boring: the minimum is alaska itself, at distance 0 (california and alaska are sister nodes in the WordNet hierarchy). Let's filter out the zero-distance entry:

>>> paths = {k:v for k,v in paths.items() if v > 0}
>>> min(paths, key=paths.get)
Synset('american_state.n.01')
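
The distance dictionary used above can be pictured as a breadth-first walk up the hypernym links. Below is a minimal sketch of that idea on a toy graph; the HYPERNYMS map and both function names are made up for illustration, and this is not NLTK's implementation. One caveat: as far as I can tell from the NLTK source, the argument passed to _shortest_hypernym_paths is treated as a simulate_root flag rather than as a second synset, so the dictionary describes only alaska's own ancestors. The sketch therefore intersects the ancestors of both nodes explicitly.

```python
from collections import deque

# Toy hypernym map standing in for WordNet (hypothetical data, not real synsets).
HYPERNYMS = {
    "alaska": ["american_state"],
    "california": ["american_state"],
    "american_state": ["state"],
    "state": ["administrative_district"],
    "administrative_district": [],
}


def hypernym_distances(start, hypernyms=HYPERNYMS):
    """Walk upward breadth-first, recording the shortest distance to each ancestor."""
    distances = {}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if node in distances:
            continue  # already reached by a path that is at least as short
        distances[node] = depth
        queue.extend((parent, depth + 1) for parent in hypernyms.get(node, []))
    return distances


def nearest_common_hypernym(a, b, hypernyms=HYPERNYMS):
    """Return the shared ancestor with the smallest combined distance, or None."""
    dist_a = hypernym_distances(a, hypernyms)
    dist_b = hypernym_distances(b, hypernyms)
    shared = set(dist_a) & set(dist_b)
    if not shared:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(shared, key=lambda node: (dist_a[node] + dist_b[node], node))


print(hypernym_distances("alaska"))
print(nearest_common_hypernym("alaska", "california"))
```

Ranking by combined distance is only one possible criterion; NLTK's own lowest_common_hypernyms may rank candidates differently, so treat this as an illustration of the walk rather than a drop-in replacement.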

To get the child nodes of american_state (I suppose this is the "something awesome" you need):

>>> min(paths, key=paths.get).hyponyms()
[Synset('free_state.n.02'), Synset('slave_state.n.01')]
>>> list(min(paths, key=paths.get).closure(lambda s:s.hyponyms()))
[Synset('free_state.n.02'), Synset('slave_state.n.01')]

This might look shocking, but there are actually no hypernyms indicated for alaska or california:

>>> alaska.hypernyms()
[]
>>> california.hypernyms()
[]

The connection made by _shortest_hypernym_paths is by means of a dummy root; take a look at "Is wordnet path similarity commutative?"
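
One possible explanation for the empty lists above: WordNet appears to attach named entities such as Alaska through instance relations rather than ordinary hypernym links, and NLTK exposes those through instance_hypernyms() and instance_hyponyms(). Under that assumption, a minimal sketch for walking back down from american_state follows. The helper names all_descendants and demo are hypothetical; the helper only relies on the hyponyms() and instance_hyponyms() methods.

```python
def all_descendants(synset):
    """Collect everything below `synset`, following both hyponym and
    instance-hyponym links. Works on any object exposing those two methods."""
    seen = set()
    ordered = []
    stack = [synset]
    while stack:
        current = stack.pop()
        for child in current.hyponyms() + current.instance_hyponyms():
            if child not in seen:
                seen.add(child)
                ordered.append(child)
                stack.append(child)
    return ordered


def demo():
    """Needs nltk installed and the WordNet corpus downloaded."""
    from nltk.corpus import wordnet as wn

    american_state = wn.synset('american_state.n.01')
    for synset in all_descendants(american_state):
        print(synset.name())
```

Calling demo() should list the synsets below american_state, which is where the individual states would be expected to show up if they are stored as instances.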

A newer solution is:

from nltk.corpus import wordnet  # this snippet uses the name `wordnet`, not `wn`

alaska = wordnet.synset('Alaska.n.1')
california = wordnet.synset('California.n.1')
alaska.lowest_common_hypernyms(california)

[Synset('american_state.n.01')]

The old function is private and does not work this way (it may behave differently now). In any case, you can also use x.common_hypernyms(y) to find all common hypernyms.
