
How to calculate the sum of list lengths for each key in a nested dictionary?

I want to calculate the sum of the list lengths for each key in a nested dictionary. The example dictionary is below. For instance, the sub-dictionaries under 'a' and 'b' both contain the key 'x', whose lists have lengths 3 and 4 respectively, so the summed length for 'x' is {'x': 7}. The full result should be {'x': 7, 'y': 9, 'z': 12}.

dict_exmple = {'a':
               {
                   'x': ['f1', 'f2', 'f6'],
                   'y': ['f6', 'f9', 'f2', 'f8'],
                   'z': ['f1', 'f9', 'f2', 'f8', 'f6', 'f10', 'f3']
               },
               'b': {
                   'x': ['f1', 'f2', 'f6', 'f4'],
                   'y': ['f6', 'f9', 'f2', 'f8', 'f17'],
                   'z': ['f1', 'f9', 'f2', 'f10', 'f3']
               }
               }
counts = {}
for key1,value1 in dict_exmple.items():
    count = 0 
    for key, value in value1.items():
        counts[key] = 0
        if counts[key] >=0: 
            counts[key] = counts[key]+len(value)
            print(counts[key])
    count = counts[key]

The code above leaves counts as {'x': 4, 'y': 5, 'z': 5}, but the result should be {'x': 7, 'y': 9, 'z': 12}.

Use collections.Counter()

Counter automatically initializes unknown keys with 0 instead of raising KeyError. Your code gives incorrect values because of counts[key] = 0: you always reset the count to 0, even if the key already exists. That also means the check if counts[key] >=0: is pointless, since you set the value to 0 in the statement just before it.

from collections import Counter

counts = Counter()
for key1, value1 in dict_exmple.items():
    for key, value in value1.items():
        counts[key] += len(value)
        print(counts[key])
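
If you would rather keep a plain dict, the same fix can be sketched with dict.get and a default of 0, so that an existing total is never reset. The function name sum_lengths_per_key and the shortened example data are illustrative assumptions, not part of the original question.

```python
def sum_lengths_per_key(nested):
    """Sum the list lengths per inner key across all outer keys."""
    counts = {}
    for inner in nested.values():
        for key, items in inner.items():
            # get() falls back to 0 only the first time a key is seen,
            # so earlier totals are never overwritten
            counts[key] = counts.get(key, 0) + len(items)
    return counts


# Shortened, hypothetical version of the question's data
example = {
    'a': {'x': ['f1', 'f2', 'f6'], 'y': ['f6', 'f9', 'f2', 'f8']},
    'b': {'x': ['f1', 'f2', 'f6', 'f4'], 'y': ['f6', 'f9', 'f2', 'f8', 'f17']},
}
print(sum_lengths_per_key(example))  # {'x': 7, 'y': 9}
```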

Assuming that both subdictionaries have the same keys (so no checks about it):

dict_example = {
    'a': {
        'x': ['f1', 'f2', 'f6'],
        'y': ['f6', 'f9', 'f2', 'f8'],
        'z': ['f1', 'f9', 'f2', 'f8', 'f6', 'f10', 'f3']
    },
    'b': {
        'x': ['f1', 'f2', 'f6', 'f4'],
        'y': ['f6', 'f9', 'f2', 'f8', 'f17'],
        'z': ['f1', 'f9', 'f2', 'f10', 'f3']
    }
}

counts = {key: sum(len(dict_example[char][key]) for char in dict_example)
          for key in dict_example["a"]}
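
If the subdictionaries might not share the same keys, one possible variation is to take the union of all inner keys and treat a missing key as an empty list. This is only a sketch; the function name sum_lengths_any_keys and the uneven sample data are made up for illustration.

```python
def sum_lengths_any_keys(nested):
    """Sum list lengths per inner key, tolerating keys missing from some subdicts."""
    # Iterating a dict yields its keys, so this builds the union of all inner keys
    all_keys = set().union(*nested.values())
    return {key: sum(len(sub.get(key, [])) for sub in nested.values())
            for key in all_keys}


# Hypothetical data where 'y' and 'w' each appear in only one subdictionary
uneven = {
    'a': {'x': [1, 2], 'y': [3]},
    'b': {'x': [4], 'w': [5, 6, 7]},
}
totals = sum_lengths_any_keys(uneven)
print(totals['x'], totals['y'], totals['w'])  # 3 1 3
```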

UPDATE:

To compute the sum of list lengths per subdictionary instead (for the example this gives {'a': 14, 'b': 14}):

counts2 = {key: sum(len(item) for item in subdict.values())
           for key, subdict in dict_example.items()}

A recursive approach that works for any depth of nesting. It assumes every value is either a dictionary or a list.

dict_exmple = {'a':
               {
                   'x': ['f1', 'f2', 'f6'],
                   'y': ['f6', 'f9', 'f2', 'f8'],
                   'z': ['f1', 'f9', 'f2', 'f8', 'f6', 'f10', 'f3']
               },
               'b': {
                   'x': ['f1', 'f2', 'f6', 'f4'],
                   'y': ['f6', 'f9', 'f2', 'f8', 'f17'],
                   'z': ['f1', 'f9', 'f2', 'f10', 'f3'],
                   'p': {'z': [1, 2, 3]}
               }
               }

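from collections import defaultdict

def cla_length(dict_data, pr_dict=None):
    # defaultdict needs a callable factory (int), not the value 0;
    # create it per call to avoid a shared mutable default argument
    if pr_dict is None:
        pr_dict = defaultdict(int)
    for i, j in dict_data.items():
        if isinstance(j, list):
            pr_dict[i] += len(j)
        else:
            # anything that is not a list is assumed to be a nested dict
            pr_dict = cla_length(j, pr_dict)
    return pr_dict


print(dict(cla_length(dict_exmple)))
# {'x': 7, 'y': 9, 'z': 15}  (the nested 'p' dict adds 3 to 'z')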

You can do this in a more generic way using recursion as follows:

dict_exmple = {'a':
               {
                   'x': ['f1', 'f2', 'f6'],
                   'y': ['f6', 'f9', 'f2', 'f8'],
                   'z': ['f1', 'f9', 'f2', 'f8', 'f6', 'f10', 'f3']
               },
               'b': {
                   'x': ['f1', 'f2', 'f6', 'f4'],
                   'y': ['f6', 'f9', 'f2', 'f8', 'f17'],
                   'z': ['f1', 'f9', 'f2', 'f10', 'f3']
               }
               }
result = dict()

def parsedict(d):
    for k, v in d.items():
        if isinstance(v, dict):
            parsedict(v)
        else:
            if isinstance(v, list):
                result[k] = result.get(k, 0) + len(v)

parsedict(dict_exmple)

print(result)

Output:

{'x': 7, 'y': 9, 'z': 12}
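
Because parsedict accumulates into the module-level result, calling it a second time would add to the earlier totals. A sketch of a variant that returns a fresh dictionary on each call is below; the name sum_list_lengths, the totals parameter, and the sample data are assumptions introduced for illustration.

```python
def sum_list_lengths(d, totals=None):
    """Recursively sum list lengths per key, returning a new dict per top-level call."""
    if totals is None:
        totals = {}  # fresh accumulator, so repeated calls do not interfere
    for k, v in d.items():
        if isinstance(v, dict):
            sum_list_lengths(v, totals)  # descend into nested dictionaries
        elif isinstance(v, list):
            totals[k] = totals.get(k, 0) + len(v)
        # values that are neither dict nor list are ignored
    return totals


# Hypothetical data with an extra nesting level and a non-list value
sample = {
    'a': {'x': [1, 2, 3], 'p': {'x': [4]}},
    'b': {'x': [5, 6], 'y': 'not a list'},
}
print(sum_list_lengths(sample))  # {'x': 6}
print(sum_list_lengths(sample))  # {'x': 6} again, no carry-over
```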

The Counter class from collections makes this relatively easy to implement as a recursive function for any depth of nesting. Note that this version also reports a total for each parent key:

from collections import Counter

def keySum(D,key=None):
    if isinstance(D,list):                  # parent key has a list
        return {key:len(D)}                 # return it with length
    result = Counter()
    for k,v in D.items():                   # for each key
        result += keySum(v,k)               # add sub Keys
    if key is not None:                     # for parent key
        result[key] += sum(result.values()) # sum of sub Keys
    return result 

print(keySum(dict_exmple))
# Counter({'a': 14, 'b': 14, 'z': 12, 'y': 9, 'x': 7})            
