
Given two words, find whether they are in the same synset

I am fairly new to NLTK. I am trying to find a solution to a problem I am currently working on:

  • Given two words w1 and w2, is there a way to find out whether they belong to the same synset in the WordNet database?
  • Also, is it possible to find the list of synsets that contain a given word?

Thanks.

Also, is it possible to find the list of synsets that contain a given word?

Yes:

>>> from nltk.corpus import wordnet as wn
>>> auto, car = 'auto', 'car'
>>> wn.synsets(auto)
[Synset('car.n.01')]
>>> wn.synsets(car)
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]

If we look at the lemmas of every synset returned by wn.synsets(car), we'll find that "car" is one of the lemmas in each:

>>> for ss in wn.synsets(car):
...     assert 'car' in ss.lemma_names()
... 
>>> for ss in wn.synsets(car):
...     print('car' in ss.lemma_names(), ss.lemma_names())
... 
True ['car', 'auto', 'automobile', 'machine', 'motorcar']
True ['car', 'railcar', 'railway_car', 'railroad_car']
True ['car', 'gondola']
True ['car', 'elevator_car']
True ['cable_car', 'car']

Note: A lemma is not exactly a surface word; see "Stemmers vs Lemmatizers". You might also find this helpful: https://github.com/alvations/pywsd/blob/master/pywsd/utils.py#L66 (disclaimer: shameless plug).
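To illustrate the lemma versus surface word distinction: an inflected form such as "cars" will not appear literally in a synset's lemma names, so a plain membership test can fail even when the word belongs there. The sketch below reduces the word to a base form first. The function name `lemma_matches` and the `normalize` parameter are hypothetical and not part of NLTK; by default the sketch assumes `wn.morphy`, which returns None when it finds no base form.

```python
def lemma_matches(word, lemma_names, normalize=None):
    """Return True if word, reduced to a base form, is among lemma_names."""
    if normalize is None:
        # Assumption: use WordNet's morphy as the default normalizer.
        # Requires nltk and the 'wordnet' corpus.
        from nltk.corpus import wordnet as wn
        normalize = wn.morphy
    # Fall back to the word itself if no base form is found.
    base = normalize(word) or word
    # WordNet lemma names join multiword expressions with underscores.
    target = base.lower().replace(' ', '_')
    return target in {name.lower() for name in lemma_names}
```

With WordNet installed, this could be called as lemma_matches('cars', ss.lemma_names()) for each synset ss of interest.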

Given two words w1 and w2, is there a way to find out whether they belong to the same synset in the WordNet database?

Yes:

>>> from nltk.corpus import wordnet as wn
>>> auto, car = 'auto', 'car'
>>> wn.synsets(auto)
[Synset('car.n.01')]
>>> wn.synsets(car)
[Synset('car.n.01'), Synset('car.n.02'), Synset('car.n.03'), Synset('car.n.04'), Synset('cable_car.n.01')]
>>> auto_ss = set(wn.synsets(auto))
>>> car_ss = set(wn.synsets(car))
>>> car_ss.intersection(auto_ss)
{Synset('car.n.01')}
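
The intersection above can be wrapped in a small helper that answers the yes/no question directly. This is only a minimal sketch: the names `shared_synsets` and `in_same_synset` and the `lookup` parameter are hypothetical and not part of NLTK. `lookup` defaults to `wn.synsets` and exists mainly so the logic can be tried without the WordNet corpus installed.

```python
def shared_synsets(w1, w2, lookup=None):
    """Return the set of synsets that contain both w1 and w2."""
    if lookup is None:
        # Default to WordNet; requires nltk and the 'wordnet' corpus.
        from nltk.corpus import wordnet as wn
        lookup = wn.synsets
    return set(lookup(w1)) & set(lookup(w2))


def in_same_synset(w1, w2, lookup=None):
    """Return True if at least one synset contains both words."""
    return bool(shared_synsets(w1, w2, lookup))
```

Given the session above, in_same_synset('auto', 'car') should be true, since the two words share Synset('car.n.01').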
