
Domain of words in WordNet

In WordNet, words are classified into separate noun, adjective, adverb and verb files. How can we get the domain of a given word, or the words that belong to a particular domain, using WordNet?

For example, suppose I have some words like (bark, dog, cat), and all these terms are related to animals. How can we find this out through WordNet? Is there any mechanism for this?

You cannot relate verbs like "bark" to the "animal" cluster directly based on WordNet. You can, however, relate dog, cat, etc. as being different kinds of animals by searching the hypernyms of these terms. WordNet's nouns are organized in a tree-like hierarchy in which every word is a member of some category (an is-a relation). Traveling up this category tree from any noun will eventually lead you to its root, called entity.

Therefore, you can use the notion of the lowest common ancestor (LCA) of two words in this category tree. If the LCA of two words is animal or a hyponym of animal, then both words are animals and so belong to the same cluster. So, if you start with some prior knowledge (say, "dog is an animal"), you can add other animals to this cluster by applying this test to each candidate word.
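The LCA test can be sketched as follows. This is only an illustration under stated assumptions: the `HYPERNYM` map is a hand-made toy hierarchy rather than real WordNet data, each word is given a single parent, and the function names are hypothetical.

```python
# Toy hypernym map (child -> parent). Illustrative only, NOT real WordNet data.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "animal",
    "animal": "entity",
    "catapult": "artifact",
    "artifact": "entity",
}


def path_to_root(word):
    """Return [word, parent, ..., root] by following hypernym links."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def lowest_common_ancestor(a, b):
    """First node on a's path to the root that also lies on b's path."""
    ancestors_of_b = set(path_to_root(b))
    for node in path_to_root(a):
        if node in ancestors_of_b:
            return node
    return None  # the two words share no ancestor in the toy map


def in_domain(a, b, domain="animal"):
    """True if the LCA of a and b is the domain node or one of its hyponyms."""
    lca = lowest_common_ancestor(a, b)
    return lca is not None and domain in path_to_root(lca)


if __name__ == "__main__":
    print(lowest_common_ancestor("dog", "cat"))       # carnivore
    print(in_domain("dog", "cat"))                    # True
    print(lowest_common_ancestor("dog", "catapult"))  # entity
    print(in_domain("dog", "catapult"))               # False
```

With real WordNet data a word can have several senses and a sense can have more than one hypernym, so a practical version would have to search over all of them instead of following a single parent chain.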

To also include terms like "bark", "moo", etc., you will need to employ more complex distance measures. These are metrics that look at different types of tree-based relationships (e.g. the path score or the Wu-Palmer score) or at the extent of overlap between the dictionary definitions of the words (e.g. Lesk).
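As a rough sketch of the two tree-based scores mentioned above, the code below computes them on a hand-made toy hierarchy. The assumptions are mine: the `HYPERNYM` map is not real WordNet data, the root is given depth 1, the path score is taken as 1 / (number of edges between the words + 1), and the Wu-Palmer score as 2 * depth(LCA) / (depth(a) + depth(b)). A real package may differ in details such as sense selection.

```python
# Toy hypernym map (child -> parent). Illustrative only, NOT real WordNet data.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "animal",
    "animal": "entity",
    "catapult": "artifact",
    "artifact": "entity",
}


def path_to_root(word):
    """Return [word, parent, ..., root] by following hypernym links."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def depth(word):
    """Number of nodes from word up to the root, so the root has depth 1."""
    return len(path_to_root(word))


def lowest_common_ancestor(a, b):
    """First node on a's path to the root that also lies on b's path."""
    ancestors_of_b = set(path_to_root(b))
    for node in path_to_root(a):
        if node in ancestors_of_b:
            return node
    return None


def wu_palmer(a, b):
    """2 * depth(LCA) / (depth(a) + depth(b)); 0.0 if there is no common ancestor."""
    lca = lowest_common_ancestor(a, b)
    if lca is None:
        return 0.0
    return 2.0 * depth(lca) / (depth(a) + depth(b))


def path_score(a, b):
    """1 / (edges on the path a -> LCA -> b, plus 1); 0.0 if no common ancestor."""
    lca = lowest_common_ancestor(a, b)
    if lca is None:
        return 0.0
    edges = (depth(a) - depth(lca)) + (depth(b) - depth(lca))
    return 1.0 / (edges + 1)


if __name__ == "__main__":
    for pair in [("dog", "cat"), ("dog", "catapult")]:
        print(pair, wu_palmer(*pair), path_score(*pair))
```

On this toy hierarchy, dog and cat score higher than dog and catapult under both measures, because their common ancestor sits deeper in the tree and the path between them is shorter.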

For example, the Lesk score between "dog" and "bark" is 158, while the score between "dog" and "catapult" is 39. A higher score thus indicates that the words are more likely to belong to the same (or a similar) category.
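The idea behind a definition-overlap score can be illustrated with a deliberately simplified sketch. Everything here is an assumption for illustration: the glosses in `GLOSS` are invented rather than taken from WordNet, the stopword list is minimal, and the score is a plain count of shared words, so it will not reproduce the numbers quoted above.

```python
# Invented glosses for illustration only; these are NOT WordNet's definitions.
GLOSS = {
    "dog": "a domestic animal that makes a loud barking sound",
    "bark": "the loud sound made by a dog",
    "catapult": "a device that hurls large stones",
}

# Minimal stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"a", "the", "that", "by"}


def content_words(word):
    """Set of lower-cased gloss tokens for word, with stopwords removed."""
    tokens = GLOSS[word].lower().split()
    return {t for t in tokens if t not in STOPWORDS}


def gloss_overlap(a, b):
    """Number of distinct content words shared by the glosses of a and b."""
    return len(content_words(a) & content_words(b))


if __name__ == "__main__":
    print(gloss_overlap("dog", "bark"))      # 2 (shared: "loud", "sound")
    print(gloss_overlap("dog", "catapult"))  # 0
```

Because the score depends only on definitions and not on the hypernym tree, it can relate a verb such as "bark" to a noun such as "dog", which the tree-based test cannot do.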

A good software package (in Java) that provides such distance measures is WS4J. The project also has an online demo.


 