I have a list of tuples that can be understood as key-value pairs, where a key can appear several times, possibly with different values, for example
[(2,8),(5,10),(2,5),(3,4),(5,50)]
I now want to get a list of tuples with the highest value for each key, i.e.
[(2,8),(3,4),(5,50)]
The order of the keys is irrelevant.
How do I do that in an efficient way?
Sort the tuples, convert the sorted list to a dictionary, and take the items from it again:
l = [(2,8),(5,10),(2,5),(3,4),(5,50)]
list(dict(sorted(l)).items())  # Python 3; in Python 2 the list() call is not needed
[(2, 8), (3, 4), (5, 50)]
The idea is that, because the pairs are sorted, the dictionary is filled in ascending order: for each key, lower values are overwritten by higher ones, so only the highest value remains. Then you just have to take the items as tuples.
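To make the mechanism easier to follow, here is a small sketch that splits the one-liner into its intermediate steps. The variable names (pairs, ordered, best) are only illustrative and not part of the original answer.

```python
pairs = [(2, 8), (5, 10), (2, 5), (3, 4), (5, 50)]

# Tuples sort by key first, then by value, so for each key
# the pair with the largest value comes last.
ordered = sorted(pairs)
# ordered == [(2, 5), (2, 8), (3, 4), (5, 10), (5, 50)]

# dict() keeps the last value it sees for a repeated key,
# which after sorting is the largest one.
best = dict(ordered)
# best == {2: 8, 3: 4, 5: 50}

result = list(best.items())
```

Note that the sort makes this O(n log n), and it relies on both the keys and the values being comparable with each other.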
At its core, this problem is essentially about grouping the tuples based on their first element and then keeping only the maximum of each group.
Grouping can be done easily with a defaultdict. A detailed explanation of grouping with defaultdicts can be found in my answer here. In your case, we group the tuples by their first element and then use the max function to find the tuple with the largest number.
import collections
tuples = [(2,8),(5,10),(2,5),(3,4),(5,50)]
groupdict = collections.defaultdict(list)
for tup in tuples:
    group = tup[0]
    groupdict[group].append(tup)
result = [max(group) for group in groupdict.values()]
# result: [(2, 8), (5, 50), (3, 4)]
In your particular case, we can optimize the code a little bit by storing only the maximum 2nd element in the dict, rather than storing a list of all tuples and finding the maximum at the end:
tuples = [(2,8),(5,10),(2,5),(3,4),(5,50)]
groupdict = {}
for tup in tuples:
    group, value = tup
    if group in groupdict:
        groupdict[group] = max(groupdict[group], value)
    else:
        groupdict[group] = value
result = list(groupdict.items())
This keeps the memory footprint to a minimum, but only works for tuples with exactly 2 elements.
This has a number of advantages over Netwave's solution:
The max function makes it easy to understand which tuples are kept. Netwave's one-liner is clever, but clever solutions are rarely easy to read.
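If the two-element restriction of the memory-saving version is a problem, one possible variation is to store the whole best tuple per key instead of just the value. This is only a sketch: the helper name max_per_key and the three-element sample data are made up for illustration, and it assumes the tuples in a group can be compared with each other.

```python
def max_per_key(tuples):
    """Return, for each first element, the largest tuple starting with it."""
    best = {}
    for tup in tuples:
        key = tup[0]
        # Keep the tuple if the key is new or it beats the current best.
        if key not in best or tup > best[key]:
            best[key] = tup
    return list(best.values())


# Hypothetical records with a third field, grouped by the first element.
records = [(2, 8, "a"), (5, 10, "b"), (2, 5, "c"), (3, 4, "d"), (5, 50, "e")]
result = max_per_key(records)
```

Like the version above, this makes a single pass and keeps one entry per key, so memory use grows with the number of distinct keys rather than with the length of the input.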