Iterating one list of synsets over another

Question

I have two sets of WordNet synsets, held in two separate lists, s1 and s2. For each synset in s1, I want to find the maximum path similarity score against the synsets in s2, so that the output has the same length as s1. For example, if s1 contains 4 synsets, the output should have length 4.

So far I have tried the following code:

import numpy as np
import nltk
from nltk.corpus import wordnet as wn
import pandas as pd

# two lists of WordNet synsets (s1, s2)
s1 = [wn.synset('be.v.01'), wn.synset('angstrom.n.01'), wn.synset('trial.n.02'), wn.synset('function.n.01')]
s2 = [wn.synset('use.n.01'), wn.synset('function.n.01'), wn.synset('check.n.01'), wn.synset('code.n.01'), wn.synset('inch.n.01'), wn.synset('be.v.01'), wn.synset('correct.v.01')]

# find the highest path similarity score for each synset in s1 against s2;
# the output should have the same length as s1
ps_list = []

def similarity_score(s1, s2):
    for word1 in s1:
        best = max(wn.path_similarity(word1, word2) for word2 in s2)
        ps_list.append(best)
    return ps_list

similarity_score(s1, s2)

But it raises the following error:

'>' not supported between instances of 'NoneType' and 'float'

I can't figure out what is going wrong. Would anyone care to take a look at my code and share their insights on the for loop? It would be really appreciated.

Thank you.

The full error traceback is:

TypeError                                 Traceback (most recent call last)
<ipython-input-73-4506121e17dc> in <module>()
     38     return word_list
     39
---> 40 s = similarity_score(s1, s2)
     41
     42

<ipython-input-73-4506121e17dc> in similarity_score(s1, s2)
     33 def similarity_score(s1, s2):
     34     for word1 in s1:
---> 35         best = max(wn.path_similarity(word1, word2) for word2 in s2)
     36         word_list.append(best)
     37

TypeError: '>' not supported between instances of 'NoneType' and 'float'

[edit] I came up with this temporary solution:

s_list = []
for word1 in s1:
    best = [word1.path_similarity(word2) for word2 in s2]
    b = pd.Series(best).max()
    s_list.append(b)

It's not elegant, but it works. Does anyone have a better solution or a handy trick for handling this?

Answer 1

I have no experience with the nltk module, but from reading the docs I can see that path_similarity is a method of whatever object wn.synset(args) returns. You are instead treating it as a function.

What you should be doing is something like this:

ps_list = []
for word1 in s1:
    best = max(word1.path_similarity(word2) for word2 in s2) #path_similarity is a method of each synset
    ps_list.append(best)
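
One caveat, sketched here without NLTK: if path_similarity returns None for any pair (which is what the traceback in the question suggests), max() in Python 3 would still raise a TypeError, whichever calling style is used. The scores below are made up purely for illustration.

```python
# Hypothetical scores standing in for path_similarity results;
# None marks a pair for which no similarity could be computed.
scores = [0.25, None, 0.5]

try:
    best = max(scores)   # comparing None with a float is not allowed
    error = None
except TypeError as exc:
    best = None
    error = exc

print(type(error).__name__)
```

So the None values probably need to be filtered out before taking the maximum.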

Answer 2

I think the error comes from the following line:

best = max(wn.path_similarity(word1, word2) for word2 in s2)

You should add a condition: if wn.path_similarity(word1, word2) returns None, max() cannot compare it with a float. For instance, you can rewrite it like this:

best = max([word1.path_similarity(word2) for word2 in s2 if word1.path_similarity(word2) is not None])
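
The one-liner above computes each similarity twice and would fail with a ValueError if every score for a synset is None. A minimal sketch of a variant that avoids both is shown below. The names best_similarities and fake_similarity are hypothetical and not part of NLTK; the similarity function is passed in as a parameter so the example runs without the WordNet corpus.

```python
def best_similarities(s1, s2, similarity, default=None):
    """Return, for each item in s1, the highest non-None score against s2."""
    s2 = list(s2)
    results = []
    for a in s1:
        # Each pair is scored exactly once; None scores are skipped.
        scores = (similarity(a, b) for b in s2)
        valid = (s for s in scores if s is not None)
        # `default` is returned when no pair produced a usable score.
        results.append(max(valid, default=default))
    return results


def fake_similarity(a, b):
    """Stand-in for path_similarity: None across 'parts of speech'."""
    if a.split(".")[-1] != b.split(".")[-1]:
        return None
    return 1.0 if a == b else 0.25


result = best_similarities(["x.n", "y.v"], ["x.n", "z.n"], fake_similarity)
print(result)

# With NLTK this would presumably be called as:
# best_similarities(s1, s2, lambda a, b: a.path_similarity(b))
```

The output always has the same length as s1, with the default value standing in for synsets that have no comparable match in s2.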

Answer 1
1, accepted, 2017-08-30 20:23:58

Answer 2
1, 2017-10-08 19:55:22
