How to unfold a Python dictionary of lists based on key-value "pairs"?
I have an algorithmic problem using a Python 3.x dictionary of lists, though perhaps another data structure is more appropriate.
Let's say I have the following Python dictionary:
dict1 = {1:[4, 12, 22], 2:[4, 5, 13, 23], 3:[7, 15, 25]}
The key 1, associated with the value [4, 12, 22], signifies that 1 is "associated with" 4. 1 is also associated with 12, and 1 is associated with 22. Likewise, 2 is associated with 4, 2 is associated with 5, 2 is associated with 13, and 2 is associated with 23, etc.
My question is, for this small example, how do I "unfold" this dictionary such that each element of the value list encodes this "association"?
That is, the end result should be:
intended_dict = {1:[4, 12, 22], 2:[4, 5, 13, 23], 3:[7, 15, 25],
                 4:[1, 2], 5:[2], 7:[3], 12:[1], 13:[2], 15:[3], 22:[1], 23:[2], 25:[3]}
because 4 is associated with 1, 4 is associated with 2, 5 is associated with 2, etc.
Is there a method to "unfold" dictionaries like this?
How would this scale to a far larger dictionary with larger lists containing millions of integers?
Perhaps another data structure would be more efficient here, especially with far larger lists?
EDIT: Given the size of the actual dictionary I am working with (not the one posted above), the solution should be as memory- and performance-efficient as possible.
The following will do:
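# Shallow copy: the value lists are still shared with dict1
intended_dict = dict1.copy()
for k, v in dict1.items():
    for i in v:
        # Create the reverse entry on first sight, then record the key
        intended_dict.setdefault(i, []).append(k)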
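Because dict1.copy() is a shallow copy, the loop above would append to dict1's own lists whenever a number appears both as a key and as a value. That does not happen in the posted example, but it might in a larger dataset. The following is a minimal sketch of a variant that copies each list first; the function name unfold is hypothetical and not part of the original answer.

```python
def unfold(mapping):
    """Return a new dict holding the original and the reversed associations."""
    # Copy every list so that appending never touches the caller's lists.
    result = {key: list(values) for key, values in mapping.items()}
    for key, values in mapping.items():
        for value in values:
            partners = result.setdefault(value, [])
            # Linear scan; skips duplicates when both directions are already listed.
            if key not in partners:
                partners.append(key)
    return result


dict1 = {1: [4, 12, 22], 2: [4, 5, 13, 23], 3: [7, 15, 25]}
intended_dict = unfold(dict1)
print(intended_dict[4])  # [1, 2]
```

The membership test on a list is linear in the list's length, so for very long lists a set-based structure is likely a better fit.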
One way is using collections.defaultdict:
from collections import defaultdict
dict1 = {1:[4, 12, 22], 2:[4, 5, 13, 23], 3:[7, 15, 25]}
# Collect the reverse associations: value -> list of keys
d_dict = defaultdict(list)
for k, l in dict1.items():
    for v in l:
        d_dict[v].append(k)
# Merge the original and the reversed mappings
intended_dict = {**dict1, **d_dict}
print(intended_dict)
#{1: [4, 12, 22], 2: [4, 5, 13, 23], 3: [7, 15, 25], 4: [1, 2], 12: [1], 22: [1], 5: [2], 13: [2], 23: [2], 7: [3], 15: [3], 25: [3]}
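Note that merging with {**dict1, **d_dict} lets the reversed entry replace the original one if a number is both a key and a value. As a hedged alternative, the sketch below records both directions in a single pass using defaultdict(set). The function name symmetric_associations is an assumption made for illustration, and the values become sets rather than lists.

```python
from collections import defaultdict


def symmetric_associations(mapping):
    """Build a dict of sets in which every association is stored in both directions."""
    neighbours = defaultdict(set)
    for key, values in mapping.items():
        # Forward direction; this also keeps keys whose list is empty.
        neighbours[key].update(values)
        for value in values:
            # Reverse direction.
            neighbours[value].add(key)
    return dict(neighbours)


dict1 = {1: [4, 12, 22], 2: [4, 5, 13, 23], 3: [7, 15, 25]}
associations = symmetric_associations(dict1)
print(sorted(associations[4]))  # [1, 2]
```

Sets give constant-time membership tests on average and drop duplicates automatically, at the cost of losing the original list order.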
A simple one-liner:
# Note: this rescans every list for each value, so it is quadratic and will not scale to large inputs
newdict = {v: [k for k in dict1 if v in dict1[k]] for values in dict1.values() for v in values}
print(newdict)
Output:
{4: [1, 2], 12: [1], 22: [1], 5: [2], 13: [2], 23: [2], 7: [3], 15: [3], 25: [3]}
To merge them:
print({**dict1, **newdict})
You're basically trying to store relations. There's a whole field about this: relations are stored in relational databases, which contain tables. In Python it would be more natural to do this as a list of 2-lists or, as your relation is symmetrical and order doesn't matter, a list of 2-sets. An even better solution, though, is pandas, which is the canonical package for doing tables in Python.
For the time being, here's how to turn your original dictionary into a pandas object, and then turn that into your desired dictionary, which includes the symmetric associations.
import pandas as pd
dict1 = {1:[4, 12, 22], 2:[4, 5, 13, 23], 3:[7, 15, 25]}
# One row per (key, value) association
relations = pd.DataFrame(
    [[key, value] for key, values in dict1.items() for value in values]
)
print(relations)
Out:
   0   1
0  1   4
1  1  12
2  1  22
3  2   4
4  2   5
5  2  13
6  2  23
7  3   7
8  3  15
9  3  25
# Group by each column in turn to collect the associations in both directions
result = {
    **{key: list(values) for key, values in relations.groupby(0)[1]},
    **{key: list(values) for key, values in relations.groupby(1)[0]}
}
print(result)
Out:
{1: [4, 12, 22],
2: [4, 5, 13, 23],
3: [7, 15, 25],
4: [1, 2],
5: [2],
7: [3],
12: [1],
13: [2],
15: [3],
22: [1],
23: [2],
25: [3]}
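The "list of 2-sets" idea mentioned in this answer can also be sketched without pandas. The example below is only an illustration under stated assumptions: the names relation and partners are hypothetical, and frozenset is used so that each unordered pair is hashable and stored exactly once.

```python
dict1 = {1: [4, 12, 22], 2: [4, 5, 13, 23], 3: [7, 15, 25]}

# Each association is stored once, as an unordered pair.
relation = {
    frozenset((key, value))
    for key, values in dict1.items()
    for value in values
}


def partners(relation, item):
    """Return, in sorted order, everything associated with item."""
    # Scans every pair, so a lookup is linear in the number of associations.
    return sorted(
        other
        for pair in relation
        if item in pair
        for other in pair
        if other != item
    )


print(partners(relation, 4))  # [1, 2]
print(partners(relation, 3))  # [7, 15, 25]
```

This stores every association only once, but each lookup has to scan all pairs, so it trades lookup speed for a single copy of each pair. A number associated with itself would collapse into a one-element set and would not be reported by partners.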