How to handle multiple keys for a dictionary in Python?
I've been searching for a way to add multiple values to a single key in a dict when a duplicate key is found.
Let's take an example:
list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
new_dict = dict(zip(list_1,list_2))
...output...
{'8': 'd', '4': 'a', '6': 'b'}
Expected output:
{'8': 'c,d', '4': 'a', '6': 'b'}
To combine the two lists above into one dict, I face the problem that a dict cannot hold two '8' keys. That is the default behavior, and I understand why.
Some of the options for handling this scenario are:
1) Check whether the key already exists in the dict; if it does, append the new value to that key's existing value.
2) Create a mutable object to reference each key, so that you can have multiple duplicate keys (not really my use case).
So, how can I get the expected output using option #1?
defaultdict / dict.setdefault
Let's jump into it:
from collections import defaultdict
d = defaultdict(list)
for i, j in zip(list_1, list_2):
    d[i].append(j)
The defaultdict makes things simple, and is efficient with appending. If you don't want to use a defaultdict, use dict.setdefault instead (but this is a bit less efficient):
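d = {}
for i, j in zip(list_1, list_2):
    d.setdefault(i, []).append(j)
# Join each list of values into one comma-separated string
# (this step works for the defaultdict version too)
new_dict = {k: ','.join(v) for k, v in d.items()}
print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}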
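For reuse, the defaultdict approach and the join step can be wrapped in one function. This is only a sketch: the name group_values and its sep parameter are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict

def group_values(keys, values, sep=","):
    """Group values by key; values that share a key are joined with sep.

    Hypothetical helper for illustration only.
    """
    grouped = defaultdict(list)
    for key, value in zip(keys, values):
        grouped[key].append(value)  # a missing key starts as an empty list
    # Collapse each list of values into a single string
    return {key: sep.join(vals) for key, vals in grouped.items()}

list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']
print(group_values(list_1, list_2))
# {'4': 'a', '6': 'b', '8': 'c,d'}
```

Passing sep=', ' would give 'c, d' instead, if you prefer a space after the comma.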
DataFrame.groupby + agg
If you want performance at high volumes, try using pandas:
import pandas as pd
df = pd.DataFrame({'A' : list_1, 'B' : list_2})
new_dict = df.groupby('A').B.agg(','.join).to_dict()
print(new_dict)
{'4': 'a', '6': 'b', '8': 'c,d'}
You can do it with a for loop that iterates over the two lists:
list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
new_dict = {}
for k, v in zip(list_1, list_2):
    if k in new_dict:
        # Key seen before: append to the existing string
        new_dict[k] += ', ' + v
    else:
        # First time this key appears
        new_dict[k] = v
There might be efficiency problems for huge dictionaries, since each += builds a new string, but it will work just fine in simple cases.
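If efficiency does become a concern, one possible variant keeps the explicit membership check but collects the values in a list and joins them once at the end. This is a sketch, not part of the original answer; the name collected is chosen here for illustration.

```python
list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']

collected = {}  # maps each key to a list of its values
for k, v in zip(list_1, list_2):
    if k in collected:
        collected[k].append(v)  # list append, no string is rebuilt
    else:
        collected[k] = [v]

# Build each final string once per key
new_dict = {k: ', '.join(vs) for k, vs in collected.items()}
print(new_dict)
# {'4': 'a', '6': 'b', '8': 'c, d'}
```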
Thanks to @Ev. Kounis and @bruno desthuilliers, who pointed out a few improvements to the original answer.
coldspeed's answer is more efficient than mine. I'm keeping this one here because it is still correct and I don't see the point in deleting it.
Try using the dictionary's setdefault method while keeping track of the current index, with try and except to check whether idx exists yet. I don't look up the index of the element with list_1.index every time because there are duplicates, and index always returns the first match. At the end I format the result so that it reads like your desired output:
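new_dict = {}
list_1 = ['4', '6' ,'8', '8']
list_2 = ['a', 'b', 'c', 'd']
for i in list_1:
    try:
        idx += 1  # move on to the next position
    except NameError:
        # First iteration: idx is not defined yet (assumes no earlier idx exists)
        idx = list_1.index(i)
    new_dict.setdefault(i, []).append(list_2[idx])
print({k:', '.join(v) for k,v in new_dict.items()})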
Output:
{'4': 'a', '6': 'b', '8': 'c, d'}
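The try/except counter above relies on idx not being defined before the loop runs. As a hedged alternative, the same position tracking could be done with the built-in enumerate. This swaps in a different technique from the one the answer uses, and the names grouped and result are chosen here for illustration.

```python
list_1 = ['4', '6', '8', '8']
list_2 = ['a', 'b', 'c', 'd']

grouped = {}
# enumerate yields each position together with the key at that position
for idx, key in enumerate(list_1):
    grouped.setdefault(key, []).append(list_2[idx])

result = {k: ', '.join(v) for k, v in grouped.items()}
print(result)
# {'4': 'a', '6': 'b', '8': 'c, d'}
```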