Match the unique second items to the repeated first item in a Python list

I am looping over a list which produces lists that contain two items, for example:

['string1', '1234567']
['string1', '1234576']
['string1', '1234765']
['string2', '7654321']
['string2', '7654123']

The first item in each list can repeat; the second item is always unique. I want to restructure the data so that each first item appears only once, paired with a list of all its corresponding second items. The desired output:

['string1', ['1234567', '1234576','1234765']]
['string2', ['7654321','7654123']]

Would it be useful to build one list of the second items and another list of the unique strings from the first items, then compare the two lists and map them in some way? I really have no idea. Is there some kind of Python functionality for this?

Since the data is sorted, you can use itertools.groupby:

from itertools import groupby

l = [['string1', '1234567'],
     ['string1', '1234576'],
     ['string1', '1234765'],
     ['string2', '7654321'],
     ['string2', '7654123']]

l2 = [[k, [x[1] for x in g]] for k, g in groupby(l, key=lambda x: x[0])]
# [['string1', ['1234567', '1234576', '1234765']],
#  ['string2', ['7654321', '7654123']]]
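
Note that groupby only merges consecutive items that share a key, which is why the sorted input matters. The following is a minimal sketch of what happens otherwise and how sorting first fixes it; the shuffled list named unsorted is a hypothetical input, not data from the question:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input in which equal first items are not adjacent.
unsorted = [['string2', '7654321'],
            ['string1', '1234567'],
            ['string2', '7654123'],
            ['string1', '1234576']]

# Without sorting, groupby starts a new group every time the key changes,
# so each first item can show up more than once.
naive = [[k, [x[1] for x in g]]
         for k, g in groupby(unsorted, key=itemgetter(0))]

# sorted() is stable, so the second items keep their relative order
# within each group.
by_key = sorted(unsorted, key=itemgetter(0))
grouped = [[k, [x[1] for x in g]]
           for k, g in groupby(by_key, key=itemgetter(0))]
```

Here naive ends up with four single-item groups, while grouped has one entry per first item.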

If the data weren't sorted, you could use a collections.defaultdict to collect all the second items for each first item. This is essentially the same approach that mshsayem chose in his answer, where he uses a vanilla dict and setdefault:

from collections import defaultdict

d = defaultdict(list)
for x, y in l:
    d[x].append(y)
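# In Python 3, items() returns a view, so wrap it in list().
# Dicts keep insertion order on Python 3.7+; older versions may order the keys differently.
l2 = list(d.items())
# [('string1', ['1234567', '1234576', '1234765']),
#  ('string2', ['7654321', '7654123'])]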

Here is a way:

>>> l = [['string1', '1234567'],
...      ['string1', '1234576'],
...      ['string1', '1234765'],
...      ['string2', '7654321'],
...      ['string2', '7654123']]
>>> result = {}
>>> for li in l:
...     result.setdefault(li[0], []).append(li[1])
...
>>> result
{'string1': ['1234567', '1234576', '1234765'], 'string2': ['7654321', '7654123']}

If you want a list of lists (as in your question), you can do this. In Python 3, map returns an iterator, so wrap it in list(). The key order shown is the insertion order that Python 3.7+ guarantees; older versions may order the keys differently:

>>> list(map(list, result.items()))
[['string1', ['1234567', '1234576', '1234765']], ['string2', ['7654321', '7654123']]]
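
The dict-based approach can also be wrapped in a small helper that returns the list-of-lists shape from the question directly. This is only a sketch: the name group_pairs is hypothetical, and it assumes Python 3.7+, where dicts preserve insertion order:

```python
def group_pairs(pairs):
    """Group second items under their first item, in first-seen key order."""
    grouped = {}
    for key, value in pairs:
        # setdefault creates the list the first time a key is seen.
        grouped.setdefault(key, []).append(value)
    # Convert each (key, values) pair into a two-item list.
    return [[key, values] for key, values in grouped.items()]


l = [['string1', '1234567'],
     ['string1', '1234576'],
     ['string1', '1234765'],
     ['string2', '7654321'],
     ['string2', '7654123']]

result = group_pairs(l)
```

Unlike the groupby version, this does not require the input to be sorted.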
