
Choosing the appropriate sense of a word from WordNet

I am using WordNet to find synonyms of ontology concepts. How can I choose the appropriate sense for an ontology concept? For example, for the ontology concept "conference", WordNet has the following synsets:

The noun conference has 3 senses (first 3 from tagged texts)

  1. (12) conference -- (a prearranged meeting for consultation or exchange of information or discussion (especially one with a formal agenda))
  2. (2) league, conference -- (an association of sports teams that organizes matches for its members)
  3. (2) conference, group discussion -- (a discussion among participants who have an agreed (serious) topic)

The 1st and 3rd synsets have the appropriate sense for my ontology concept. How can I choose only these two from WordNet?

The technology you're looking for lies in the area of semantic disambiguation / representation.

The most "traditional approach" is Word Sense Disambiguation (WSD), take a look at

Then comes the next generation of Word Sense induction / Topic modelling / Knowledge representation :

Then comes the most recent hype:

  • Word embeddings, vector space models, neural nets

Sometimes people skip the semantic representation and go directly to text similarity: they compare pairs of sentences and examine their differences/similarities before moving on to the ultimate aim of the text processing.
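As a rough illustration of comparing a pair of sentences directly, here is a minimal sketch of a bag-of-words cosine similarity. It is only a simple baseline, not what any particular STS system does; the function names and the example sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Count lowercase alphabetic tokens in a sentence."""
    return Counter(re.findall(r"[a-z]+", sentence.lower()))


def cosine_similarity(sentence_a, sentence_b):
    """Cosine similarity between the token-count vectors of two sentences."""
    va, vb = bag_of_words(sentence_a), bag_of_words(sentence_b)
    # Only tokens present in both sentences contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical sentences, invented for this example.
meeting_sense = "a conference is a meeting for discussion"
sports_sense = "a conference is a league of sports teams"
query = "participants hold a discussion at the meeting"

closer_to_meeting = (
    cosine_similarity(query, meeting_sense) > cosine_similarity(query, sports_sense)
)
```

With these sentences the query ends up closer to the "meeting" sentence than to the "sports" one. Swapping the count vectors for learned embeddings would keep the same comparison step.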

Take a look at "Normalize ranking score with weights" for a list of STS (semantic textual similarity) related work.

On the other direction, there's

There's also a recent task on ontology induction / expansion:

Depending on the ultimate task, any of the above technologies may help.

You can also try Babelfy, which provides Word Sense Disambiguation and Named Entity Disambiguation.

Demo: http://babelfy.org/

API: http://babelfy.org/guide

Take a look at this list: "100 Best GitHub: Word-sense Disambiguation", and search it for WordNet; there are several appropriate libraries.

I haven't used any of them, but this one seems promising, because it is based on a classic yet effective idea (namely, the Lesk algorithm) upgraded with modern word-embedding methods. In fact, before finding it, I was going to suggest trying almost the same ideas.

Note also that all of these methods try to find the meaning (the WordNet synset, in your case) that is most similar to the context of the current word/collocation, so it is crucial to have context for the words you're trying to disambiguate. For example, the words can come from some text, and most libraries rely on that.
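To make the role of context concrete, here is a minimal sketch of a simplified Lesk-style gloss overlap, using the three "conference" glosses quoted in the question. It is not the implementation of any of the libraries mentioned above. The stopword list, the `min_overlap` threshold, and the context sentence are assumptions made up for this example; in practice the context might come from the labels or descriptions of neighbouring ontology concepts.

```python
import re

# Glosses copied from the WordNet output quoted in the question.
GLOSSES = {
    1: "a prearranged meeting for consultation or exchange of information "
       "or discussion (especially one with a formal agenda)",
    2: "an association of sports teams that organizes matches for its members",
    3: "a discussion among participants who have an agreed (serious) topic",
}

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {
    "a", "an", "the", "of", "or", "for", "with", "its", "that", "who",
    "have", "one", "among", "and", "to", "in", "is", "on",
}


def content_words(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def rank_senses(context, glosses=GLOSSES):
    """Return (sense_number, overlap) pairs, highest overlap first."""
    ctx = content_words(context)
    scores = [(sid, len(ctx & content_words(gloss))) for sid, gloss in glosses.items()]
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))


def matching_senses(context, min_overlap=1):
    """Keep every sense whose gloss shares at least min_overlap words with the context."""
    return [sid for sid, score in rank_senses(context) if score >= min_overlap]


# Hypothetical context for the ontology concept "conference".
context = ("participants submit papers and join a discussion at the meeting, "
           "following the agenda")

ranking = rank_senses(context)       # [(1, 3), (3, 2), (2, 0)]
selected = matching_senses(context)  # [1, 3]
```

Keeping every sense above a threshold, rather than only the top-ranked one, is one way to end up with both the 1st and the 3rd synset, as the question asks. Plain word overlap is brittle, which is why the approaches above combine it with word embeddings.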

