What is the most efficient way of computing similarity between two dictionaries of lists?
I want to compute accuracy using set logic. I'll explain with an example.

For these two dictionaries:
d1 = {1: {'hello', 'goodbye'}, 2:{'sayonnara'}, 3:{'origami'}}
d2 = {1: {'goodbye'}, 2:{'hola', 'bye'}, 3:{'bird','origami','giraffe'}}
I want to get this result (for each key, the size of the intersection divided by the absolute difference of the set sizes plus one):

{1: 0.5, 2: 0, 3: 0.3333333333333333}
I'm doing it this way:
from collections import defaultdict

acc = defaultdict(list)
for (k, v1) in d1.items():
    # compare every key of d1 against every key of d2
    for (k, v) in d2.items():  # note: this k shadows the outer loop's k
        nb = len(v1.intersection(v))
        if (nb > 0):
            print(nb)
            acc[k] = nb / (abs(len(v) - len(v1)) + 1)
            print(acc)
        if k not in acc.keys():
            acc[k] = 0
Is there a more efficient solution than this?
If we operate under the assumption that both dicts have the same keys, this can be done with a dict comprehension and a single loop (one pass over the keys instead of comparing every pair of keys):
print({k1: (len(v1.intersection(d2[k1])) / (abs(len(v1) - len(d2[k1])) + 1))
for k1, v1 in d1.items()})
which outputs
{1: 0.5, 2: 0.0, 3: 0.3333333333333333}
This can be generalized a bit, just to be on the safe side, by making sure we only take into account the keys common to both dicts:
print({common_key: (len(d1[common_key].intersection(d2[common_key])) / (abs(len(d1[common_key]) - len(d2[common_key])) + 1))
for common_key in d1.keys() & d2.keys()})