Python Sum Second Element of Tuple by First Element

I have the following two lists in Python3:

list1 = [(3, 2), (1, 5), (4, 2), (2, 0)]
list2 = [(1, 6), (2, 1), (3, 3), (4, 0)]

Each tuple in the lists is of the form (id, score). I want to combine the scores from the two lists for each id separately, and then sort by score in descending order. So, the output should be:

list3 = [(1, 11), (3, 5), (4, 2), (2, 1)]

What's the easiest way to do this?

Use itertools.groupby to group the tuples by id, sum the scores in each group, and sort the result by score in descending order:

from itertools import groupby

list1 = [(3, 2), (1, 5), (4, 2), (2, 0)]
list2 = [(1, 6), (2, 1), (3, 3), (4, 0)]

f = lambda x: x[0]
list3 = sorted([(k, sum(x[1] for x in g)) for k, g in groupby(sorted(list1 + list2, key=f), key=f)], key=lambda x: x[1], reverse=True)

# [(1, 11), (3, 5), (4, 2), (2, 1)]
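
The one-liner above is fairly dense, so here is a sketch of the same steps written out one at a time. The names `by_id`, `combined` and `totals` are mine, not part of the answer. Note that groupby only merges adjacent items, which is why the combined list has to be sorted by id first.

```python
from itertools import groupby
from operator import itemgetter

list1 = [(3, 2), (1, 5), (4, 2), (2, 0)]
list2 = [(1, 6), (2, 1), (3, 3), (4, 0)]

by_id = itemgetter(0)

# groupby only groups adjacent items, so sort by id first
combined = sorted(list1 + list2, key=by_id)

# sum the scores within each id group
totals = [(k, sum(score for _, score in g)) for k, g in groupby(combined, key=by_id)]

# sort by total score, highest first
list3 = sorted(totals, key=itemgetter(1), reverse=True)
print(list3)
# [(1, 11), (3, 5), (4, 2), (2, 1)]
```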

A simpler solution, without any imports, may be:

list1 = [(3, 2), (1, 5), (4, 2), (2, 0)]
list2 = [(1, 6), (2, 1), (3, 3), (4, 0)]

list1.sort()
list2.sort()

list3 = [(list1[i][0], list1[i][1] + list2[i][1]) for i in range(len(list1))]
list3.sort(key=lambda x: x[1], reverse=True)

print(list3)

# [(1, 11), (3, 5), (4, 2), (2, 1)]

This assumes a few things, though:

  • You have no duplicate IDs
  • You always have the ID in the [0] index and score in [1]
  • Your two lists are always the same shape
  • You only have two lists
  • You have no missing IDs (e.g. it breaks with IDs 1-4 in list1 and IDs 2-5 in list2)

Maybe that's OK for this case? It depends on how broadly applicable you want this to be.
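
To see why the missing-ID assumption matters, here is a small sketch with hypothetical inputs `a` and `b` (not from the question) whose ids don't line up. The index-based pairing above runs without error but silently adds scores that belong to different ids, so a cheap check before pairing may be worthwhile.

```python
# Hypothetical inputs: ids 1-3 in a, ids 2-4 in b
a = sorted([(1, 5), (2, 0), (3, 2)])
b = sorted([(2, 1), (3, 3), (4, 0)])

# Same index-based pairing as in the answer above
paired = [(a[i][0], a[i][1] + b[i][1]) for i in range(len(a))]
print(paired)
# [(1, 6), (2, 3), (3, 2)]  <- id 1 received id 2's score, and so on

# A cheap guard: confirm both lists carry the same ids in the same order
ids_match = [i for i, _ in a] == [i for i, _ in b]
print(ids_match)
# False
```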

One way is to turn at least one of the lists into a dictionary:

d = dict(list2)
print(sorted(((k, v+d[k]) for k, v in list1), 
             key=lambda a: a[1], reverse=True))

For a more general case where you might have more than 2 lists and where not every id necessarily appears in every list, this can become:

dicts = [dict(L) for L in [list1, list2]]
keys = set().union(*dicts)   # get all of the IDs
sums = ((k, sum(d.get(k, 0) for d in dicts)) for k in keys)
print(sorted(sums, key=lambda a: a[1], reverse=True))

or, more readably:

sums = {}
for scores in [list1, list2]:
    for id_, score in scores:
        if id_ in sums:
            sums[id_] += score
        else:
            sums[id_] = score
# sort the (id, total) pairs, not just the keys
print(sorted(sums.items(), key=lambda a: a[1], reverse=True))

Perhaps the nicest option is to use collections.Counter, which has a most_common method that does the sorting you want:

from collections import Counter
print((Counter(dict(list1)) + Counter(dict(list2))).most_common())
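
One caveat with the Counter approach: adding two Counters with + keeps only entries whose total is positive, so an id whose combined score is zero or negative disappears from the result. That doesn't happen with the question's data, but the sketch below uses hypothetical inputs `a` and `b` to show it, along with Counter.update as one possible way to keep such ids.

```python
from collections import Counter

# Hypothetical scores where id 2 totals 0
a = [(1, 5), (2, 0)]
b = [(1, 6), (2, 0)]

# Counter + Counter drops totals that are zero or negative
added = Counter(dict(a)) + Counter(dict(b))
print(added.most_common())
# [(1, 11)]

# update() adds the scores in place and keeps zero totals
totals = Counter()
for scores in (a, b):
    totals.update(dict(scores))
print(totals.most_common())
# [(1, 11), (2, 0)]
```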

Another solution, using a dict comprehension:

d1, d2 = dict(list1), dict(list2)  # build each lookup once; take ids from both lists
sorted({k: d1.get(k, 0) + d2.get(k, 0) for k in d1.keys() | d2.keys()}.items(), reverse=True, key=lambda x: x[1])
# [(1, 11), (3, 5), (4, 2), (2, 1)]
