
Making a dictionary from two lists using indices

I have two lists:

alist =  [11,12,13,11,15]
blist = ['A', 'A', 'B', 'A', 'B']

I want to build a dictionary in which the items of blist are the keys, and each value is a list of the items of alist found at the same indices as that key.

The outcome should be:

{'A': [11, 12, 11], 'B': [13, 15]}

I have tried this:

dictNames = {}
for i in xrange(len(alist)):
    for letter in blist:
        if letter not in dictNames:
            dictNames[letter] = []
        else:
            dictNames[letter].append(alist[i])

which gives the outcome:

{'A': [11, 11, 12, 12, 12, 13, 13, 13, 11, 11, 11, 15, 15, 15], 'B': [11, 12, 12, 13, 13, 11, 11, 15, 15]}

Why does this append each value repeatedly, rather than adding it once to the list of the matching letter?

Use a defaultdict for ease:

from collections import defaultdict

dictNames = defaultdict(list)
for key, value in zip(blist, alist):
    dictNames[key].append(value)

This creates:

>>> dictNames
defaultdict(<type 'list'>, {'A': [11, 12, 11], 'B': [13, 15]})

defaultdict is a subclass of dict, so it will still work just like any other dict.
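
To illustrate the subclass point, here is a small sketch (written so that it also runs on Python 3; the names `is_dict`, `has_z` and `plain` are only for illustration). It also shows one behaviour worth knowing: merely indexing a missing key on a defaultdict creates that key, so converting to a plain dict once the grouping is done can be handy.

```python
from collections import defaultdict

alist = [11, 12, 13, 11, 15]
blist = ['A', 'A', 'B', 'A', 'B']

dictNames = defaultdict(list)
for key, value in zip(blist, alist):
    dictNames[key].append(value)

# defaultdict is a dict subclass, so normal dict operations work
is_dict = isinstance(dictNames, dict)   # True
keys = sorted(dictNames.keys())         # ['A', 'B']

# Caveat: indexing a missing key inserts it with an empty list
_ = dictNames['Z']
has_z = 'Z' in dictNames                # True
del dictNames['Z']

# Convert to a plain dict to get KeyError on missing keys again
plain = dict(dictNames)
```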

Without defaultdict, you have to make sure the key already has a list before appending to it; setdefault() does the check and the insertion in one call:

dictNames = {}
for key, value in zip(blist, alist):
    dictNames.setdefault(key, []).append(value)

resulting in:

>>> dictNames
{'A': [11, 12, 11], 'B': [13, 15]}
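
As a small illustration of what setdefault() does on each call (a sketch; `first`, `second` and `same_list` are illustrative names): it returns the existing value for the key, or stores the given default and returns it if the key is missing.

```python
d = {}

# Key missing: the default [] is stored under 'A' and returned
first = d.setdefault('A', [])
first.append(11)

# Key present: the stored list is returned, the new [] is discarded
second = d.setdefault('A', [])
second.append(12)

same_list = first is second   # True
# d is now {'A': [11, 12]}
```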

The real trick here is using zip() to combine your key and value lists instead of your double loops.
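
To see what zip() actually hands to the loop, this short sketch pairs up the two lists by position (`pairs` is a name introduced here):

```python
alist = [11, 12, 13, 11, 15]
blist = ['A', 'A', 'B', 'A', 'B']

# list() is needed on Python 3, where zip() returns a lazy iterator
pairs = list(zip(blist, alist))
# pairs == [('A', 11), ('A', 12), ('B', 13), ('A', 11), ('B', 15)]
```

Each tuple holds the key and the value from the same index, so a single loop over the pairs is enough.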

First, you loop over both lists: for every item in alist, the code loops through all of blist, so the inner loop body runs 25 times. You want it to run 5 times, so you need only one loop.

Second, you correctly initialize the list if it does not yet exist, but in that case the number is not added to the list. The number should always be added to the list, even if it is a new list.
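
To make both problems visible, here is a sketch that reruns the original nested loops with two counters added. The names `inner_runs`, `skipped` and `total_appended` are introduced here, and `range` replaces `xrange` so that it runs on Python 3.

```python
alist = [11, 12, 13, 11, 15]
blist = ['A', 'A', 'B', 'A', 'B']

dictNames = {}
inner_runs = 0   # how many times the inner loop body executes
skipped = 0      # how many times a value is dropped on key creation

for i in range(len(alist)):
    for letter in blist:
        inner_runs += 1
        if letter not in dictNames:
            dictNames[letter] = []
            skipped += 1          # alist[i] is NOT appended here
        else:
            dictNames[letter].append(alist[i])

total_appended = sum(len(v) for v in dictNames.values())
# inner_runs == 25, skipped == 2, total_appended == 23
```

The 23 appended values (14 under 'A', 9 under 'B') match the output shown in the question.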

I changed your code to take these two things into account, and it works a little better:

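dictNames = {}
for i in xrange(len(alist)):
    # Take the letter at the same index instead of looping over blist
    letter = blist[i]
    if letter not in dictNames:
        dictNames[letter] = []
    # Always append, even when the list was just created
    dictNames[letter].append(alist[i])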

Output:

{'A': [11, 12, 11], 'B': [13, 15]}

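This way preserves order, but note that it keeps only the first occurrence of each value per key, so the result differs from the output requested in the question: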

from collections import defaultdict

alist =  [11,12,13,11,15]
blist = ['A', 'A', 'B', 'A', 'B']

d = defaultdict(list)
seen = defaultdict(set)

for k, v in zip(blist, alist):
    if v not in seen[k]:
        d[k].append(v)
        seen[k].add(v)

print d

defaultdict(<type 'list'>, {'A': [11, 12], 'B': [13, 15]})

Here is a one-line solution:

{k: [alist[i] for i in range(len(blist)) if blist[i] == k] for k in set(blist)}

The only problem is that the whole list is rescanned for each distinct key, so the time complexity is O(n^2) in the worst case (every key distinct), which is too slow for large lists.
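
For comparison, here is a sketch of how the same result can be built in a single pass, which visits each element once regardless of how many distinct keys there are. The function names `group_quadratic` and `group_linear` are made up for this illustration.

```python
def group_quadratic(keys, values):
    # Rescans the whole key list once per distinct key
    return {k: [values[i] for i in range(len(keys)) if keys[i] == k]
            for k in set(keys)}


def group_linear(keys, values):
    # Visits each (key, value) pair exactly once
    result = {}
    for k, v in zip(keys, values):
        result.setdefault(k, []).append(v)
    return result


alist = [11, 12, 13, 11, 15]
blist = ['A', 'A', 'B', 'A', 'B']

same = group_quadratic(blist, alist) == group_linear(blist, alist)  # True
```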

This is the shortest expression I can come up with currently. Note that it builds sets rather than lists, so duplicate values are dropped:

from itertools import groupby

{k: {x[1] for x in v} for k, v in groupby(sorted(zip(blist, alist)), lambda x: x[0])}

The relevant (and not yet mentioned) part is the call to groupby, also described in the following similar question: Python group by
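
If lists with duplicates are wanted (as in the question) rather than sets, one possible variation is to sort by key only. This is a sketch, not the answerer's code: it relies on Python's sort being stable, so values keep their original relative order within each key. The names `pairs` and `grouped` are introduced here.

```python
from itertools import groupby
from operator import itemgetter

alist = [11, 12, 13, 11, 15]
blist = ['A', 'A', 'B', 'A', 'B']

# Sort on the key alone; a stable sort keeps each key's values in input order
pairs = sorted(zip(blist, alist), key=itemgetter(0))

# groupby only merges adjacent items, which is why the sort comes first
grouped = {k: [v for _, v in group]
           for k, group in groupby(pairs, key=itemgetter(0))}
# grouped == {'A': [11, 12, 11], 'B': [13, 15]}
```

The sort makes this O(n log n), so the single-loop approaches are still the simpler choice when no other grouping logic is needed.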

