How to recategorize a list of tuples by the first element in Python?

Using Python 3.x, I have a list of tuples as follows (the first element of each tuple is either an integer or a string):

tuple_list = [(1, 'AA', 515), (1, 'BBT', 101), 
                  (1, 'CZF', 20), (2, 'TYZ', 8341), (2, 'ONR', 11)]

In this example, some of the tuples begin with 1 and others with 2, and all of them are in a single flat list.

I would like a way to "categorize" the tuples with the same first element into separate lists.

The desired solution in this case is the following, a list of lists:

[[(1, 'AA', 515), (1, 'BBT', 101), (1, 'CZF', 20)], 
        [(2, 'TYZ', 8341), (2, 'ONR', 11)]]

This is possible by iterating and checking whether a list already exists for each (unique) first element, but that will be computationally expensive for larger lists with more "unique" first elements than just 1 and 2.
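For reference, the iterate-and-check approach described above might look like the following sketch. It is an assumed reading of that description, not code from the question; the inner loop scans every existing group for each tuple, which is where the cost comes from as the number of distinct first elements grows.

```python
tuple_list = [(1, 'AA', 515), (1, 'BBT', 101),
              (1, 'CZF', 20), (2, 'TYZ', 8341), (2, 'ONR', 11)]

result = []
for t in tuple_list:
    # Linear scan over the groups built so far
    for group in result:
        if group[0][0] == t[0]:
            group.append(t)
            break
    else:
        # No group with this first element yet: start a new one
        result.append([t])

print(result)
```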

How can this be done quickly and efficiently?

Use itertools.groupby, paired with operator.itemgetter as the key function. groupby only groups consecutive items, so the list has to be sorted by the same key first.

from itertools import groupby
from operator import itemgetter

tuple_list = [(1, 'AA', 515), (1, 'BBT', 101), (1, 'CZF', 20), (2, 'TYZ', 8341), (2, 'ONR', 11)]

get_first = itemgetter(0)
result = [list(g) for k, g in groupby(sorted(tuple_list, key=get_first), get_first)]

Result:

[[(1, 'AA', 515), (1, 'BBT', 101), (1, 'CZF', 20)], [(2, 'TYZ', 8341), (2, 'ONR', 11)]]
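To see why the sort matters, here is a minimal sketch using a hypothetical interleaved variant of the input (unsorted_list is made up for illustration and is not from the question). groupby starts a new group every time the key changes, so unsorted input gets split into several partial groups.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: the same tuples, but interleaved
unsorted_list = [(1, 'AA', 515), (2, 'TYZ', 8341), (1, 'BBT', 101),
                 (2, 'ONR', 11), (1, 'CZF', 20)]

get_first = itemgetter(0)

# Without sorting, every run of equal keys becomes its own group
unsorted_groups = [list(g) for _, g in groupby(unsorted_list, get_first)]

# Sorting by the same key first brings equal keys together
sorted_groups = [list(g) for _, g in
                 groupby(sorted(unsorted_list, key=get_first), get_first)]

print(len(unsorted_groups))  # 5
print(len(sorted_groups))    # 2
```

Note that the sort costs O(n log n), and that in Python 3 it raises a TypeError if the first elements mix integers and strings, since those types cannot be compared with each other.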

Or use collections.defaultdict, which groups in a single pass without sorting:

from collections import defaultdict

d = defaultdict(list)

for t in tuple_list:
    d[t[0]].append(t)

result = list(d.values())

Result:

[[(1, 'AA', 515), (1, 'BBT', 101), (1, 'CZF', 20)], [(2, 'TYZ', 8341), (2, 'ONR', 11)]]
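Since the question mentions that the first element may be an integer or a string, here is a small sketch of the same idea wrapped in a function. The name group_by_first and the mixed sample data are hypothetical, chosen only for illustration. Because nothing is sorted, integer and string keys can coexist, and on Python 3.7+ the groups come out in the order each key was first seen.

```python
from collections import defaultdict

def group_by_first(tuples):
    """Group tuples by their first element, keeping first-seen key order."""
    groups = defaultdict(list)
    for t in tuples:
        groups[t[0]].append(t)
    return list(groups.values())

# Hypothetical data with both int and str first elements
mixed = [(1, 'AA', 515), ('x', 'BBT', 101), (1, 'CZF', 20), ('x', 'ONR', 11)]
grouped = group_by_first(mixed)
print(grouped)
# [[(1, 'AA', 515), (1, 'CZF', 20)], [('x', 'BBT', 101), ('x', 'ONR', 11)]]
```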

Another way is to use a defaultdict keyed on the first element, append each tuple to its group, and then sort the groups by that element, like this:

from collections import defaultdict

tuple_list = [(1, 'AA', 515), (1, 'BBT', 101),
                  (1, 'CZF', 20), (2, 'TYZ', 8341), (2, 'ONR', 11)]

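dct = defaultdict(list)
for tup in tuple_list:
    # Append each tuple to the list for its first element
    dct[tup[0]].append(tup)

# Sort the groups by their shared first element
print(sorted(dct.values(), key=lambda group: group[0][0]))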

>>> [[(1, 'AA', 515), (1, 'BBT', 101), (1, 'CZF', 20)], [(2, 'TYZ', 8341), (2, 'ONR', 11)]]

