
Why isn't WSD matching WordNet?

I'm getting to grips with word sense disambiguation (WSD) and WordNet, and I'm trying to work out why they give different results. My understanding is that, in the code below, the disambiguate function picks the most likely Synset for each word:

from pywsd import disambiguate
from nltk.corpus import wordnet as wn

mysent = 'I went to have a drink in a bar'

wsd = disambiguate(mysent)

# Print one (word, synset) pair per line
for pair in wsd:
    print(pair)

This gives me the following output:

('I', None)
('went', Synset('travel.v.01'))
('to', None)
('have', None)
('a', None)
('drink', Synset('swallow.n.02'))
('in', None)
('a', None)
('bar', Synset('barroom.n.01'))

I find it odd that the word 'I' comes back as None, given that looking up the same word in WordNet gives me four possible interpretations. Surely 'I' should correspond to at least one of them?

wn.synsets('I')

Out:
[Synset('iodine.n.01'), Synset('one.n.01'), Synset('i.n.03'), Synset('one.s.01')]

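In your sentence above, 'I' is a pronoun, and WordNet has no entries for pronouns. The synsets you found belong to other uses of 'I' as a noun or adjective (such as iodine and the number one), not to the pronoun. The WordNet FAQ states: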

Q: Why is WordNet missing: of, an, the, and, about, above, because, etc.

A: WordNet only contains "open-class words": nouns, verbs, adjectives, and adverbs. Thus, excluded words include determiners, prepositions, pronouns, conjunctions, and particles.
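To make the open-class rule concrete, here is a minimal sketch that maps Penn Treebank part-of-speech tags to WordNet's part-of-speech codes and returns None for everything else. The function name penn_to_wordnet_pos and the hand-assigned tags are assumptions made for illustration. This is not necessarily how pywsd decides internally which words to skip.

```python
from typing import Optional

# Penn Treebank tag prefixes for the four open classes that WordNet covers,
# mapped to WordNet's single-letter part-of-speech codes.
OPEN_CLASS_PREFIXES = {"NN": "n", "VB": "v", "JJ": "a", "RB": "r"}


def penn_to_wordnet_pos(tag: str) -> Optional[str]:
    """Return the WordNet POS code for a Penn Treebank tag, or None
    when the tag is closed-class (pronoun, determiner, preposition, ...)."""
    for prefix, wn_pos in OPEN_CLASS_PREFIXES.items():
        if tag.startswith(prefix):
            return wn_pos
    return None


# Tags assigned by hand for the example sentence (an assumption; a real
# tagger would normally supply these).
tagged = [
    ("I", "PRP"), ("went", "VBD"), ("to", "TO"), ("have", "VB"),
    ("a", "DT"), ("drink", "NN"), ("in", "IN"), ("a", "DT"), ("bar", "NN"),
]

for word, tag in tagged:
    print(word, tag, penn_to_wordnet_pos(tag))
```

Under this mapping the pronoun 'I' (PRP) has no WordNet part of speech, so there is no pronoun sense to choose from. Note that 'have' is an open-class verb here, so the None it received in the question's output presumably comes from a separate filter such as a stopword list. That is a guess, not something the output confirms.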
