
Getting attributes from WordNet

I read in an article that WordNet can be used to find out whether a word has attributes. For example, the word "size" has the attributes "big" and "small"; similarly, the word "quality" has the attributes "inferior", "superior", etc. Can anyone tell me how to do this in Java or R? Thanks in advance.

I believe you can achieve much of what you want with Tyler Rinker's very useful qdap package, more specifically its synonyms function.

library(qdap)
synonyms(c("size", "quality"))

$size.def_1
 [1] "amount"          "bigness"         "bulk"            "dimensions"      "extent"          "greatness"       "hugeness"       
 [8] "immensity"       "largeness"       "magnitude"       "mass"            "measurement (s)" "proportions"     "range"          
[15] "vastness"        "volume"         

$size.def_2
 [1] "diminutive"     "little"         "midget"         "miniature"      "pocket"         "pygmy or pigmy" "small"         
 [8] "teensy-weensy"  "teeny-weeny"    "tiny"           "wee"           

$size.def_3
[1] "appraise"              "assess"                "evaluate"              "eye up"                "get (something) taped"

$quality.def_1
[1] "aspect"         "attribute"      "characteristic" "condition"      "feature"        "mark"           "peculiarity"   
[8] "property"       "trait"         

$quality.def_2
[1] "character"    "constitution" "description"  "essence"      "kind"         "make"         "nature"       "sort"        

$quality.def_3
 [1] "calibre"      "distinction"  "excellence"   "grade"        "merit"        "position"     "pre-eminence" "rank"        
 [9] "standing"     "status"       "superiority"  "value"        "worth"       

$quality.def_4
[1] "aristocracy"  "gentry"       "nobility"     "ruling class" "upper class" 

Assign the synonyms to a list object and extract what you want.

 attributes <- synonyms(c("size", "quality"))
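As a rough sketch of the "extract what you want" step, the snippet below works on a small hand-built list that stands in for the result of synonyms(). The entries are copied (and shortened) from the output above so that it runs without qdap; the variable names syn, small_terms, size_senses and size_words are only illustrative.

```r
# Stand-in for the list returned by synonyms(c("size", "quality"));
# entries are copied (and shortened) from the output shown above so that
# the example runs without qdap installed.
syn <- list(
  size.def_1    = c("amount", "bigness", "bulk", "magnitude"),
  size.def_2    = c("diminutive", "little", "small", "tiny"),
  quality.def_3 = c("calibre", "excellence", "superiority")
)

# Pick one sense by its name.
small_terms <- syn[["size.def_2"]]

# Keep every sense that belongs to one headword.
size_senses <- syn[grepl("^size\\.", names(syn))]

# Flatten those senses into a single character vector.
size_words <- unique(unlist(size_senses, use.names = FALSE))

print(small_terms)
print(names(size_senses))
print("small" %in% size_words)
```

With the real result of synonyms() the same indexing should apply, since it is printed above as a named list of character vectors.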

Also, here is a related Stack Overflow question: Identifying near duplicate entries using synonyms in R

In WordNet there are links between words (to be precise, between synsets). One of the possible link types is the "attribute" link. Here you can see that the first sense of the word "size" has two attribute links, one to the adjective "large" and another to the adjective "small":

http://wordnetweb.princeton.edu/perl/webwn?o2=1&o0=1&o8=1&o1=1&o7=1&o5=1&o9=&o6=1&o3=1&o4=1&s=size&i=2&h=1000000000001000#c

To get this information with the Java API, use the getAttributes() function of a noun synset: first use search to get the noun synset for the first sense of the word "size", then call getAttributes() on it and iterate through the results. (The R wordnet API appears to be a wrapper around the Java API, so the same idea should apply.)
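On the R side, a minimal sketch of the same idea might look like the following. It assumes the CRAN wordnet package and a local WordNet installation whose dict directory the package can find (otherwise wordnet::setDict() would need to be called first). The helper name noun_attributes is hypothetical. I have not verified that the first synset returned always corresponds to the first sense shown in the web interface.

```r
# Hypothetical helper (not from the answer): attribute adjectives linked to
# the first noun synset of `word`, using the CRAN 'wordnet' package.
# Assumes WordNet is installed locally and the package can locate its
# dict directory.
noun_attributes <- function(word) {
  # Match the word exactly, ignoring case.
  filter <- wordnet::getTermFilter("ExactMatchFilter", word, TRUE)
  terms <- wordnet::getIndexTerms("NOUN", 1, filter)
  if (length(terms) == 0) return(list())

  synsets <- wordnet::getSynsets(terms[[1]])
  if (length(synsets) == 0) return(list())

  # "=" is WordNet's pointer symbol for the attribute relation.
  related <- wordnet::getRelatedSynsets(synsets[[1]], "=")

  # One character vector of word forms per linked adjective synset.
  lapply(related, wordnet::getWord)
}

# Usage (requires the wordnet package and a WordNet dictionary):
# noun_attributes("size")
```

The function only touches the wordnet package when it is called, so defining it is harmless on a machine without WordNet.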

