
WordNet word phrases in a synset

How can we find the word phrases in a synset? In particular, take this synset for the adjective "booked":

booked, engaged, set-aside -- (reserved in advance)

I use the RiTaWN Java package (WordNet version 2.1) and cannot seem to find the phrases. In the example above, when I run

RiWordnet wordnet = new RiWordnet(null);
String[] syn = wordnet.getSynset(word, "a", true);
for(int i = 0; i < syn.length; i++)
            System.out.println(syn[i]);

it only outputs

booked engaged

while "set-aside" is not listed.

I have tested many synsets, and no phrases are ever returned. Another example:

commodity, trade good, good -- (articles of commerce)

Here "trade good" is not returned from the getSynset() method. So how can we actually get the phrases?

(The RiTaWN package was obtained from http://rednoise.org/rita/wordnet/documentation/index.htm)

This answer is a bit out of left field, but in any case...

Idilia has an online WordNet-like database that is actually much more complete and richer than WordNet. Depending on where you are in your application, it may make sense, so I'm mentioning it. There are coding examples for Java access on the site.

In this case the query:

[{"fs":"booked/J1", "lemma":[], "definition":null}]

would return

{ "fs" : "booked/J1", "lemma" : [ "set_aside", "set-aside", "engaged", "booked" ], "definition" : "reserved in advance." }

RiTaWN seems to ignore "compound-words" by default. You can disable this to get the full list of phrases (line 2 below).

RiWordnet wordnet = new RiWordnet();
wordnet.ignoreCompoundWords(false);
String[] syn = wordnet.getSynset("booked", "a", true);
System.out.println(Arrays.asList(syn));

Result:

[INFO] RiTa.WordNet.version [033]
[booked, engaged, set-aside] 
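
To make the effect of that flag concrete, here is a minimal sketch in plain Java with no RiTa dependency. It is not RiTaWN's actual implementation: the class `CompoundFilterSketch`, the methods `isCompound` and `filter`, and the rule that a lemma is "compound" when it contains a space, hyphen, or underscore are all assumptions made for illustration. The rule is at least consistent with the behaviour seen in the question, where "set-aside" and "trade good" were the entries that went missing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from RiTaWN.
class CompoundFilterSketch {

    // Assumption: a lemma is "compound" if it contains a space, hyphen,
    // or underscore. RiTaWN's real rule may differ.
    static boolean isCompound(String lemma) {
        return lemma.contains(" ") || lemma.contains("-") || lemma.contains("_");
    }

    // Returns all lemmas, or only the single-word ones when ignoreCompounds is true.
    static List<String> filter(List<String> lemmas, boolean ignoreCompounds) {
        List<String> kept = new ArrayList<>();
        for (String lemma : lemmas) {
            if (!ignoreCompounds || !isCompound(lemma)) {
                kept.add(lemma);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> booked = Arrays.asList("booked", "engaged", "set-aside");
        System.out.println(filter(booked, true));   // prints [booked, engaged]
        System.out.println(filter(booked, false));  // prints [booked, engaged, set-aside]
    }
}
```

With the filter on, the sketch reproduces the two-word output from the question; with it off, it reproduces the three-entry result shown above.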
