What is the most efficient way to compute the similarity between two dictionaries of lists?

Question

I want to compute accuracy using set logic. I'll explain with an example:

For these two dictionaries:

d1 = {1: {'hello', 'goodbye'}, 2:{'sayonnara'}, 3:{'origami'}}
d2 = {1: {'goodbye'}, 2:{'hola', 'bye'}, 3:{'bird','origami','giraffe'}}

I want to get this result:

{1: 0.5, 2: 0, 3: 0.3333333333333333}

I'm doing it this way:

from collections import defaultdict
acc=defaultdict(list)
for (k,v1) in d1.items():
    for (k,v) in d2.items():
        nb=len(v1.intersection(v))
        if (nb>0):
            print(nb)
            acc[k] = nb/ (abs(len(v) - len(v1))+1)
            print(acc)
        if k not in acc.keys():
            acc[k] = 0

Is there a more efficient solution than this?

Answer 1

If we assume that both dicts have the same keys, this can be done with a dict comprehension and a single loop:

print({k1: (len(v1.intersection(d2[k1])) / (abs(len(v1) - len(d2[k1])) + 1))
       for k1, v1 in d1.items()})

Output:

{1: 0.5, 2: 0.0, 3: 0.3333333333333333}

This can be generalized a bit by taking into account only the keys common to both dicts, just to be on the safe side:

print({common_key: (len(d1[common_key].intersection(d2[common_key])) / (abs(len(d1[common_key]) - len(d2[common_key])) + 1))
       for common_key in d1.keys() & d2.keys()})
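
If keys that appear in only one dict should still show up in the result with a score of 0 (as the `acc[k] = 0` fallback in the question suggests), a small helper along these lines might work. This is only a sketch: the name `set_scores` and the extra key `4` in the sample data are made up for illustration, and it assumes every value is a `set`.

```python
def set_scores(d1, d2):
    """Score each key as |A & B| / (abs(|A| - |B|) + 1), the formula from the question."""
    scores = {}
    # Iterate over the union of keys so nothing is silently dropped.
    for key in d1.keys() | d2.keys():
        a = d1.get(key, set())  # a missing key is treated as an empty set
        b = d2.get(key, set())
        scores[key] = len(a & b) / (abs(len(a) - len(b)) + 1)
    return scores


d1 = {1: {'hello', 'goodbye'}, 2: {'sayonnara'}, 3: {'origami'}}
# Key 4 is a hypothetical extra entry, present only in d2.
d2 = {1: {'goodbye'}, 2: {'hola', 'bye'}, 3: {'bird', 'origami', 'giraffe'}, 4: {'extra'}}

result = set_scores(d1, d2)
print(result)
```

Like the comprehensions above, this makes a single pass with one set intersection per key, so the cost grows with the number of keys rather than with the number of key pairs as in the nested loop.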

Answer 1: score 3, accepted, 2020-01-27 11:17:19
