
0th synset in NLTK wordnet interface

In the SemCor corpus ( http://www.cse.unt.edu/~rada/downloads.html ), there are senses that weren't mapped to later versions of WordNet. And magically, the mapping can be found in the NLTK WordNet API as such:

>>> from nltk.corpus import wordnet as wn
# Enumerate the possible senses for the lemma 'delayed'
>>> wn.synsets('delayed')
[Synset('delay.v.01'), Synset('delay.v.02'), Synset('stay.v.06'), Synset('check.v.07'), Synset('delayed.s.01')]
>>> wn.synset('delay.v.01')
Synset('delay.v.01')
# Magically, there is a 0th sense of the word!!!
>>> wn.synset('delayed.a.0')
Synset('delayed.s.01')

I've checked the code and the API ( http://nltk.googlecode.com/svn/trunk/doc/api/nltk.corpus.reader.wordnet.Synset-class.html , http://nltk.org/_modules/nltk/corpus/reader/wordnet.html ) but I can't find how they did the magical mapping that shouldn't exist (e.g. delayed.a.0 -> delayed.s.01 ).

Does anyone know which part of the NLTK WordNet API code does the magical mapping?

It's a bug, I guess. When you call wn.synset('delayed.a.0'), the first two lines in the method are:

lemma, pos, synset_index_str = name.lower().rsplit('.', 2)
synset_index = int(synset_index_str) - 1

So in this case the value of synset_index is -1, which is a valid index in Python (it selects the last element). The lookup in the list of synsets whose lemma is delayed and whose pos is a therefore doesn't fail.
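To see the wrap-around without loading WordNet, here is a minimal sketch. `SENSES` and `lookup` are hypothetical stand-ins, not NLTK code; the sense lists are only reconstructed from the outputs shown in the question. The two parsing lines are the ones quoted above.

```python
# Hypothetical stand-in for the per-lemma, per-POS sense lists that the
# real lookup indexes into. Entries are taken from the outputs in the question.
SENSES = {
    ("delayed", "a"): ["delayed.s.01"],
    ("delay", "v"): ["delay.v.01", "delay.v.02", "stay.v.06", "check.v.07"],
}


def lookup(name):
    """Mimic the two parsing lines quoted above, then index the sense list."""
    lemma, pos, synset_index_str = name.lower().rsplit('.', 2)
    synset_index = int(synset_index_str) - 1  # '0' -> -1, '-1' -> -2
    # A negative index counts from the end of the list instead of failing.
    return SENSES[(lemma, pos)][synset_index]


print(lookup("delay.v.01"))   # index 0, the first sense
print(lookup("delayed.a.0"))  # index -1, the last sense
print(lookup("delay.v.-1"))   # index -2, the second-to-last sense
```

With these assumed lists, 'delayed.a.0' resolves to the last (and only) entry, which matches the delayed.s.01 result in the question.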

With this behavior you can do tricky things like:

>>> wn.synset('delay.v.-1')
Synset('stay.v.06')
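If you want to avoid relying on this by accident, one option is to validate the name yourself before passing it to wn.synset. The sketch below is only an illustration: `parse_synset_name` is a hypothetical helper, not part of NLTK, and it is not how NLTK itself handles the case.

```python
def parse_synset_name(name):
    """Split 'lemma.pos.nn' and reject sense numbers below 1.

    Hypothetical helper for illustration; not part of NLTK.
    Returns (lemma, pos, zero_based_index).
    """
    lemma, pos, synset_index_str = name.lower().rsplit('.', 2)
    sense_number = int(synset_index_str)
    if sense_number < 1:
        # Without this check, 0 and negative numbers would turn into
        # negative list indices and silently count from the end.
        raise ValueError(
            f"sense number must be >= 1, got {sense_number} in {name!r}"
        )
    return lemma, pos, sense_number - 1


print(parse_synset_name("delay.v.01"))
try:
    parse_synset_name("delayed.a.0")
except ValueError as err:
    print("rejected:", err)
```

Calling the helper first means 'delayed.a.0' and 'delay.v.-1' raise a ValueError instead of quietly returning some other sense.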

