How to get the WordNet synset given an offset ID?

Question

I have a WordNet synset offset (for example, id="n#05576222"). Given this offset, how can I get the synset using Python?

Answer 1

As of NLTK 3.2.3, there is a public method for doing this:

wordnet.synset_from_pos_and_offset(pos, offset)

In earlier versions you can use:

wordnet._synset_from_pos_and_offset(pos, offset)

This returns a synset based on its POS and offset ID. I think this method is only available in NLTK 3.0, but I'm not sure.

Example:

from nltk.corpus import wordnet as wn
wn.synset_from_pos_and_offset('n',4543158)
>> Synset('wagon.n.01')
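
The ID in the question has the form "n#05576222", so it has to be split into a POS letter and an integer offset before either method above can be called. A minimal sketch of that step follows; the helper names parse_offset_id and synset_from_id are made up for illustration, and the set of accepted POS letters is an assumption about which tags appear in such IDs.

```python
def parse_offset_id(synset_id):
    """Split an ID such as 'n#05576222' into ('n', 5576222)."""
    pos, _, digits = synset_id.partition("#")
    # Assumed POS letters: noun, verb, adjective, adverb, adjective satellite.
    if pos not in {"n", "v", "a", "r", "s"} or not digits.isdigit():
        raise ValueError("unrecognized synset ID: %r" % (synset_id,))
    return pos, int(digits)


def synset_from_id(synset_id):
    """Look up the synset for an ID like 'n#05576222' (NLTK 3.2.3 or newer)."""
    # Imported lazily so that parse_offset_id can be used without NLTK.
    from nltk.corpus import wordnet as wn
    pos, offset = parse_offset_id(synset_id)
    return wn.synset_from_pos_and_offset(pos, offset)


print(parse_offset_id("n#05576222"))  # ('n', 5576222)
```

With NLTK and the WordNet corpus installed, synset_from_id("n#05576222") should then return the matching Synset, provided the offset belongs to the WordNet version that NLTK ships.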

Answer 2

For NLTK 3.2.3 or newer, see donners45's answer.

For older versions of NLTK:

There is no built-in method in NLTK, but you can use this:

from nltk.corpus import wordnet

syns = list(wordnet.all_synsets())
offsets_list = [(s.offset(), s) for s in syns]
offsets_dict = dict(offsets_list)

offsets_dict[14204095]
>>> Synset('heatstroke.n.01')

You can then pickle the dictionary and load it whenever you need it.
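
The pickling step could look roughly like the sketch below. It differs from the snippet above in two ways, both of which are my own choices rather than anything NLTK requires: it stores synset names instead of Synset objects, so the pickle does not depend on how Synset objects serialize, and it keys on (pos, offset), since the same number may occur under more than one part of speech. The function names are hypothetical.

```python
import os
import pickle
import tempfile


def build_offset_index(synsets):
    """Map (pos, offset) -> synset name for an iterable of synset objects."""
    return {(s.pos(), s.offset()): s.name() for s in synsets}


def save_index(index, path):
    """Write the index to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(index, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_index(path):
    """Read a previously pickled index back into memory."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Round trip with one hand-written entry standing in for the full index.
sample = {("n", 14204095): "heatstroke.n.01"}
path = os.path.join(tempfile.mkdtemp(), "offsets.pkl")
save_index(sample, path)
restored = load_index(path)
print(restored[("n", 14204095)])  # heatstroke.n.01
```

With NLTK 3.x the real index would come from build_offset_index(wordnet.all_synsets()), and wordnet.synset(name) turns a stored name back into a Synset.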

For NLTK versions before 3.0, replace the line

offsets_list = [(s.offset(), s) for s in syns]

with

offsets_list = [(s.offset, s) for s in syns]

because before NLTK 3.0, offset was an attribute rather than a method.
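
If the same script has to run on both sides of that change, one option is a small accessor that handles either form. This is only a sketch: get_offset is a made-up helper, and OldStyle and NewStyle are stand-in classes that imitate the two behaviours, not NLTK classes.

```python
def get_offset(synset):
    """Return the offset whether `offset` is a method or a plain attribute."""
    value = synset.offset
    return value() if callable(value) else value


class OldStyle(object):
    # Imitates NLTK before 3.0: offset is an attribute.
    offset = 14204095


class NewStyle(object):
    # Imitates NLTK 3.0 and later: offset is a method.
    def offset(self):
        return 14204095


print(get_offset(OldStyle()) == get_offset(NewStyle()))  # True
```

The dictionary line then becomes offsets_list = [(get_offset(s), s) for s in syns] on either version.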

Answer 3

You can use of2ss(), for example:

from nltk.corpus import wordnet as wn
syn = wn.of2ss('01580050a')

This returns Synset('necessary.a.01').
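
Note that the argument in that example is the 8-digit offset followed by the POS letter, which is not the "n#05576222" layout from the question. Converting between the two is a small string operation; to_of2ss_key below is a hypothetical helper name, and the sketch assumes the ID always has the form letter, '#', digits.

```python
def to_of2ss_key(synset_id):
    """Turn 'a#01580050' into '01580050a' (8-digit offset, then POS letter)."""
    pos, _, digits = synset_id.partition("#")
    # Zero-pad to eight digits in case the ID was stored without leading zeros.
    return "%08d%s" % (int(digits), pos)


print(to_of2ss_key("a#01580050"))  # 01580050a
```

The result can be passed straight to wn.of2ss().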

Answer 4

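Besides NLTK, another option is to use the .tab file for the Princeton WordNet from the Open Multilingual WordNet (http://compling.hss.ntu.edu.sg/omw/). I normally use the recipe below to access WordNet as a dictionary, with the offset as key and a ";"-delimited string as value: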

import codecs

# Returns every key whose ";"-delimited value contains the given word.
def getKey(dic, value):
  return [k for k, v in dic.items() if value in v.split(";")]

# Read Open Multi WN's .tab file
def readWNfile(wnfile, option="ss"):
  reader = codecs.open(wnfile, "r", "utf8").readlines()
  wn = {}
  for l in reader:
    if l[0] == "#": continue
    if option=="ss":
      k = l.split("\t")[0] #ss as key
      v = l.split("\t")[2][:-1] #word
    else:
      v = l.split("\t")[0] #ss as value
      k = l.split("\t")[2][:-1] #word as key
    try:
      temp = wn[k]
      wn[k] = temp + ";" + v
    except KeyError:
      wn[k] = v  
  return wn

princetonWN = readWNfile('wn-data-eng.tab')
offset = "n#05576222"
offset = offset.split('#')[1]+'-'+ offset.split('#')[0]

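# Words in the synset stored under this offset
print(princetonWN[offset].split(";"))
# Offsets of every synset that contains the word
print(getKey(princetonWN, 'heatstroke'))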
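
A variant of the same idea keeps lists instead of ";"-joined strings and builds both lookup directions in one pass, which avoids rescanning the whole dictionary for every word lookup. This is a sketch rather than a drop-in replacement: read_wn_tab is a made-up name, the three-column tab-separated layout (offset-pos, type, lemma) is assumed from the recipe above, and the sample lines are hand-written for illustration instead of being read from wn-data-eng.tab.

```python
from collections import defaultdict


def read_wn_tab(lines):
    """Build offset -> [lemmas] and lemma -> [offsets] from .tab lines."""
    by_offset = defaultdict(list)
    by_lemma = defaultdict(list)
    for line in lines:
        # Skip blank lines and '#' comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not in the assumed three-column layout
        offset, lemma = fields[0], fields[2]
        by_offset[offset].append(lemma)
        by_lemma[lemma].append(offset)
    return by_offset, by_lemma


# Hand-written sample lines in the assumed layout.
sample = [
    "# comment line\n",
    "14204095-n\tlemma\theatstroke\n",
    "14204095-n\tlemma\theat hyperpyrexia\n",
]
by_offset, by_lemma = read_wn_tab(sample)
print(by_offset["14204095-n"])  # ['heatstroke', 'heat hyperpyrexia']
print(by_lemma["heatstroke"])   # ['14204095-n']
```

For the real file, pass an open file object, for example read_wn_tab(codecs.open('wn-data-eng.tab', 'r', 'utf8')).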

4 answers

Answer 1
26 2014-11-26 09:37:04

Answer 2
14 2012-09-11 21:53:53

Answer 3
7 2017-03-20 14:36:28

Answer 4
1 2013-02-02 02:21:28