Get maximum value in dictionary for a given tuple key where elements of the tuple may be in different positions
I have the following dictionary:
{('2019-07-243541760601284691', '2019-07-243541760603812661'): 1086,
('2019-07-243541760601314711', '2019-07-243541760603996721'): 662,
('2019-07-243541760603794841', '2019-07-243541760600899921'): 483,
('2019-07-243541760603794841', '2019-07-243541760601224211'): 70,
('2019-07-243541760603794841', '2019-07-243541760607368321'): 54,
('2019-07-243541760600899921', '2019-07-243541760601224211'): 93,
('2019-07-243541760600899921', '2019-07-243541760607368321'): 74,
('2019-07-243541760601224211', '2019-07-243541760607368321'): 490,
('2019-07-243541760613553761', '2019-07-243541760602348611'): 484,
('2019-07-243541760602450401', '2019-07-243541760602927941'): 1118,
('2019-07-243541760603292161', '2019-07-243541760606108621'): 732}
The keys are tuples of unique ids. As you can see, certain ids are repeated within both columns (e.g. lines 3 to 5 have the same first id, and element 2 from line 3 is also present in lines 6 and 7). I want to find a way to return the maximum value for each given unique id.
If it helps, the output I am looking for would look like this:
{('2019-07-243541760601284691', '2019-07-243541760603812661'): 1086,
('2019-07-243541760601314711', '2019-07-243541760603996721'): 662,
('2019-07-243541760603794841', '2019-07-243541760600899921'): 483,
('2019-07-243541760601224211', '2019-07-243541760607368321'): 490,
('2019-07-243541760613553761', '2019-07-243541760602348611'): 484,
('2019-07-243541760602450401', '2019-07-243541760602927941'): 1118,
('2019-07-243541760603292161', '2019-07-243541760606108621'): 732}
In other words, each unique id should appear in at most one key in the dictionary.
You can subclass the built-in types in Python to change their behavior. For your case it's easy to create a dict subclass that automatically stores repeated values in a list under the same key. You can then get the max value of the list.
class Dictlist(dict):
    def __setitem__(self, key, value):
        # Create an empty list the first time a key is seen,
        # then append to it instead of overwriting.
        try:
            self[key]
        except KeyError:
            super(Dictlist, self).__setitem__(key, [])
        self[key].append(value)
Output example:
>>> d = Dictlist()
>>> d['test'] = 1
>>> d['test'] = 2
>>> d['test'] = 3
>>> d
{'test': [1, 2, 3]}
>>> d['other'] = 100
>>> d
{'test': [1, 2, 3], 'other': [100]}
So when you want the max value for a key, look up that key and pass the resulting list to max():
max(d[key_value])
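To connect this idea to the question's data, here is a minimal sketch that records each value under both ids of its tuple and then takes the max per id. It uses collections.defaultdict rather than the Dictlist subclass above, and the function name max_per_id and the sample data are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict


def max_per_id(pairs):
    """Return {id: largest value among all tuples that contain that id}."""
    values_by_id = defaultdict(list)
    for (first, second), value in pairs.items():
        # Store the value under both ids, like the list-per-key dict above.
        values_by_id[first].append(value)
        values_by_id[second].append(value)
    return {id_: max(values) for id_, values in values_by_id.items()}


# Hypothetical small sample in the same shape as the question's dictionary.
sample = {("a", "b"): 10, ("a", "c"): 3, ("c", "d"): 7}
print(max_per_id(sample))  # {'a': 10, 'b': 10, 'c': 7, 'd': 7}
```

Note that this only gives the maximum per id; it does not by itself pick which tuple keys to keep.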
First I sort d by value, in descending order, into a list L. Then I iterate over L and keep the ids seen so far in a set ids_set: if neither id of a tuple is in this set yet, I add both ids to the set and put the tuple into my dict final.
d = {('2019-07-243541760601284691', '2019-07-243541760603812661'): 1086,
('2019-07-243541760601314711', '2019-07-243541760603996721'): 662,
('2019-07-243541760603794841', '2019-07-243541760600899921'): 483,
('2019-07-243541760603794841', '2019-07-243541760601224211'): 70,
('2019-07-243541760603794841', '2019-07-243541760607368321'): 54,
('2019-07-243541760600899921', '2019-07-243541760601224211'): 93,
('2019-07-243541760600899921', '2019-07-243541760607368321'): 74,
('2019-07-243541760601224211', '2019-07-243541760607368321'): 490,
('2019-07-243541760613553761', '2019-07-243541760602348611'): 484,
('2019-07-243541760602450401', '2019-07-243541760602927941'): 1118,
('2019-07-243541760603292161', '2019-07-243541760606108621'): 732}
L = sorted(d.items(), key=lambda item: item[1], reverse=True)
# L = [(('2019-07-243541760602450401', '2019-07-243541760602927941'), 1118),...
final = {}
ids_set = set()
for i in L:
    t = i[0]  # tuple of two ids
    val = i[1]
    # keep the tuple only if neither of its ids has been used already
    if t[0] not in ids_set and t[1] not in ids_set:
        ids_set.update(t)
        final[t] = val
for i, v in final.items():
    print(i, v)
Output:
('2019-07-243541760602450401', '2019-07-243541760602927941') 1118
('2019-07-243541760601284691', '2019-07-243541760603812661') 1086
('2019-07-243541760603292161', '2019-07-243541760606108621') 732
('2019-07-243541760601314711', '2019-07-243541760603996721') 662
('2019-07-243541760601224211', '2019-07-243541760607368321') 490
('2019-07-243541760613553761', '2019-07-243541760602348611') 484
('2019-07-243541760603794841', '2019-07-243541760600899921') 483
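The same greedy idea can be wrapped in a reusable function. This is a sketch under the assumption that a tuple should be kept only when neither of its ids has been used by a higher-valued tuple; the name greedy_unique_pairs and the sample data are hypothetical.

```python
def greedy_unique_pairs(pairs):
    """Keep the highest-valued tuples so that each id is used at most once."""
    chosen = {}
    seen_ids = set()
    # Visit tuples from the largest value to the smallest.
    by_value = sorted(pairs.items(), key=lambda item: item[1], reverse=True)
    for pair, value in by_value:
        # isdisjoint is True only if neither id has been seen before.
        if seen_ids.isdisjoint(pair):
            seen_ids.update(pair)
            chosen[pair] = value
    return chosen


# Hypothetical sample: ("c", "a") has an unseen first id but shares "a".
sample = {("a", "b"): 10, ("c", "a"): 8, ("c", "d"): 7, ("b", "d"): 2}
print(greedy_unique_pairs(sample))  # {('a', 'b'): 10, ('c', 'd'): 7}
```

The sample shows why both positions of the tuple need to be checked: ("c", "a") must be skipped because "a" already appears in ("a", "b"), even though "c" has not been seen yet.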
d = {('2019-07-243541760601284691', '2019-07-243541760603812661'): 1086,
('2019-07-243541760601314711', '2019-07-243541760603996721'): 662,
('2019-07-243541760603794841', '2019-07-243541760600899921'): 483,
('2019-07-243541760603794841', '2019-07-243541760601224211'): 70,
('2019-07-243541760603794841', '2019-07-243541760607368321'): 54,
('2019-07-243541760600899921', '2019-07-243541760601224211'): 93,
('2019-07-243541760600899921', '2019-07-243541760607368321'): 74,
('2019-07-243541760601224211', '2019-07-243541760607368321'): 490,
('2019-07-243541760613553761', '2019-07-243541760602348611'): 484,
('2019-07-243541760602450401', '2019-07-243541760602927941'): 1118,
('2019-07-243541760603292161', '2019-07-243541760606108621'): 732}
# flatten the dict: map each id to the largest value among all the tuples that contain it
flat_dict = {}
for k, v in d.items():
    for id_ in k:
        flat_dict[id_] = max(v, flat_dict.get(id_, v))
# get the unique keys you want
keys, already_used = [], []
for k in d:
    if k[0] in already_used or k[1] in already_used:
        continue
    keys.append(k)
    already_used.extend(k)
# now create the new_dict
new_dict = {k:max(flat_dict[k[0]], flat_dict[k[1]]) for k in keys}
Output:
>>> new_dict
{('2019-07-243541760601284691', '2019-07-243541760603812661'): 1086,
('2019-07-243541760601314711', '2019-07-243541760603996721'): 662,
('2019-07-243541760603794841', '2019-07-243541760600899921'): 483,
('2019-07-243541760601224211', '2019-07-243541760607368321'): 490,
('2019-07-243541760613553761', '2019-07-243541760602348611'): 484,
('2019-07-243541760602450401', '2019-07-243541760602927941'): 1118,
('2019-07-243541760603292161', '2019-07-243541760606108621'): 732}
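Whichever approach is used, the requirement from the question (each id appears in at most one key) can be checked mechanically. The helper below is a hypothetical sketch, not part of any of the answers; ids_are_unique is an assumed name.

```python
from collections import Counter


def ids_are_unique(result):
    """Return True if no id occurs in more than one tuple key of result."""
    counts = Counter(id_ for pair in result for id_ in pair)
    return all(n == 1 for n in counts.values())


print(ids_are_unique({("a", "b"): 10, ("c", "d"): 7}))  # True
print(ids_are_unique({("a", "b"): 10, ("c", "a"): 8}))  # False
```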