
How to split a Python dictionary into multiple dictionaries based on values

I have a dictionary like this:

{'A': [0, 2, 5],
 'B': [1],
 'C': [3, 6, 9],
 'D': [4, 7, 10],
 'E': [8, 11, 13],
 'F': [12]}

and I would like to split it into multiple dictionaries (say 2), based on the differences of consecutive elements in the values lists. For example, above:

dict_2 = {'A': [0, 2],
     'E': [11, 13]}

dict_3 = {'A': [2, 5],
     'C': [3, 6, 9],
     'D': [4, 7, 10],
     'E': [8, 11]}

So I compare consecutive values in each list: if the difference between an element and the one before it is 2, I put the pair in dict_2 , and if it is 3, I put it in dict_3 . I ignore lists that have only one element, as well as differences other than 2 or 3.

I am trying a rather cumbersome approach:

def construct_dicts(init_dict, no_jumps=[2,3]):
    dct_2, dct_3 = {}, {}
    for key in init_dict.keys():
        for index in range(len(init_dict[key])):
            if init_dict[key][index+1] - init_dict[key][index] = no_jumps[0]:
                dct_2[key] = [index, index + 1]
            elif init_dict[key][index+1] - init_dict[key][index] = no_jumps[1]:
                dct_3[key] = [index, index + 1]

This however is cumbersome and ugly (and does not yet work). Is there a more pythonic way to do this?

Here is a general approach using a nested collections.defaultdict :

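from collections import defaultdict

def categorize_dicts(dictionary):
    # Outer key: difference between consecutive items; inner key: original key.
    dfd = defaultdict(defaultdict)
    for k, v in dictionary.items():
        # Pair each item with its successor.
        for i, j in zip(v, v[1:]):
            dfd[j-i].setdefault(k, []).extend((i, j))
    return dfd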

Demo:

In [28]: d = {'A': [0, 2, 5],
        ...:  'B': [1],
        ...:  'C': [3, 6, 9],
        ...:  'D': [4, 7, 10],
        ...:  'E': [8, 11, 13],
        ...:  'F': [12, 15, 17],
        ...:  'T': [19]}
        ...:  

In [29]: categorize_dicts(d)
Out[29]: 
defaultdict(collections.defaultdict,
            {2: defaultdict(None, {'A': [0, 2], 'E': [11, 13], 'F': [15, 17]}),
             3: defaultdict(None,
                         {'A': [2, 5],
                          'C': [3, 6, 6, 9],
                          'D': [4, 7, 7, 10],
                          'E': [8, 11],
                          'F': [12, 15]})})
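Because each pair is added with extend, the output above repeats the shared element of back-to-back pairs (for example the 6 in 'C'). If that is not wanted, a minimal sketch of a variant follows. The name categorize_pairs and the jumps parameter are hypothetical, not part of the answer above; the function keeps only the requested differences and skips the first item of a pair when it is already the last item stored.

```python
from collections import defaultdict

def categorize_pairs(dictionary, jumps=(2, 3)):
    """Hypothetical variant: group consecutive pairs by difference, without repeats."""
    result = defaultdict(dict)
    for key, values in dictionary.items():
        # Pair each item with its successor.
        for i, j in zip(values, values[1:]):
            if j - i not in jumps:
                continue  # ignore differences we were not asked for
            bucket = result[j - i].setdefault(key, [])
            # Add i only if the previous pair did not already end with it.
            if not bucket or bucket[-1] != i:
                bucket.append(i)
            bucket.append(j)
    return dict(result)

d = {'A': [0, 2, 5], 'B': [1], 'C': [3, 6, 9], 'D': [4, 7, 10],
     'E': [8, 11, 13], 'F': [12, 15, 17], 'T': [19]}
result = categorize_pairs(d)
# result[2] == {'A': [0, 2], 'E': [11, 13], 'F': [15, 17]}
# result[3]['C'] == [3, 6, 9]
```

Note that this sketch merges separate runs with the same difference under one key into a single list (for example [0, 2, 5, 7] gives [0, 2, 5, 7] for difference 2), which may or may not be what is wanted.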

There are a few problems in your code:

  1. for index in range(len(init_dict[key])) combined with index + 1 will eventually result in IndexError .
  2. You place the indices index, index + 1 in the new list, instead of the corresponding list items.

The approach below aims to address the above problems and to include a few improvements. It works for arbitrary differences and uses another dict to distinguish between them. Values are stored in a set in order to prevent duplicates when subsequent items have the same difference. If this is unintended, a list can be used with an additional check.

from collections import defaultdict

d = {
    'A': [0, 2, 5],
    'B': [1],
    'C': [3, 6, 9],
    'D': [4, 7, 10],
    'E': [8, 11, 13],
    'F': [12]
}
diff = defaultdict(lambda: defaultdict(set))

for k, v in d.items():
    for i, j in zip(v, v[1:]):
        diff[j-i][k] |= {i, j}
