
How to iterate through an N-level nested dictionary in Python?

I find myself making multilevel dictionaries quite a bit. I always have to write very verbose code to iterate through all the levels of the dictionaries with a lot of temporary variables.

Is there a way to generalize this function to iterate through any number of levels, instead of hard-coding each level and manually specifying how many levels there are?

def iterate_multilevel_dictionary(d, number_of_levels):
    # How to auto-detect number of levels? 
    # number_of_levels = 0
    if number_of_levels == 1:
        for k1, v1 in d.items():
            yield k1, v1
    if number_of_levels == 2:
        for k1, v1 in d.items():
            for k2, v2 in v1.items():
                yield k1, k2, v2
    if number_of_levels == 3:
        for k1, v1 in d.items():
            for k2, v2 in v1.items():
                for k3, v3 in v2.items():
                    yield k1, k2, k3, v3
                    
# Level 1
d_level1 = {"a":1,"b":2,"c":3}
for items in iterate_multilevel_dictionary(d_level1, number_of_levels=1):
    print(items)
# ('a', 1)
# ('b', 2)
# ('c', 3)

# Level 2
d_level2 = {"group_1":{"a":1}, "group_2":{"b":2,"c":3}}
for items in iterate_multilevel_dictionary(d_level2, number_of_levels=2):
    print(items)
#('group_1', 'a', 1)
#('group_2', 'b', 2)
#('group_2', 'c', 3)

# Level 3
d_level3 = {"collection_1":d_level2}
for items in iterate_multilevel_dictionary(d_level3, number_of_levels=3):
    print(items)
# ('collection_1', 'group_1', 'a', 1)
# ('collection_1', 'group_2', 'b', 2)
# ('collection_1', 'group_2', 'c', 3)

I wrote this after seeing @VoNWooDSoN's answer. I turned it into a generator instead of printing inside the function, and made a few changes to improve readability. See his original answer for comparison.

def flatten(d, base=()):
    for k, v in d.items():
        if isinstance(v, dict):
            yield from flatten(v, base + (k,))
        else:
            yield base + (k, v)

1- yielding instead of printing.

2- isinstance() instead of type() so that subclasses of dict also work. You could also check against Mapping from the collections.abc module instead of dict to make it more generic.

3- IMO, getting (k, v) pairs from .items() is much more readable than using k and d[k].
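To illustrate point 2, here is a minimal sketch of the more generic check. It assumes you want any read-only or custom mapping to count as a nested level, so it tests against collections.abc.Mapping; the name flatten_mapping is only used here to keep it apart from the flatten above.

```python
from collections import OrderedDict
from collections.abc import Mapping
from types import MappingProxyType


def flatten_mapping(d, base=()):
    """Like flatten, but descends into any Mapping, not only dict."""
    for k, v in d.items():
        if isinstance(v, Mapping):
            yield from flatten_mapping(v, base + (k,))
        else:
            yield base + (k, v)


# MappingProxyType is a Mapping but not a dict subclass,
# so an isinstance(v, dict) check would treat it as a leaf value.
nested = {
    "group_1": MappingProxyType({"a": 1}),
    "group_2": OrderedDict(b=2, c=3),
}
rows = list(flatten_mapping(nested))
for row in rows:
    print(row)
```

Whether this is desirable depends on your data: if some mapping-like objects should be treated as leaf values, the stricter dict check is the safer choice.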

More generic?

Do you want to make this even more generic, so that it CAN (but doesn't have to, unlike the solution in the OP) accept a depth limit, just in case?

Consider these examples:

d_level1 = {"a": 1, "b": 2, "c": 3}
d_level2 = {"group_1": {"a": 1}, "group_2": {"b": 2, "c": 3}}
d_level3 = {"collection_1": d_level2}

for items in flatten(d_level3):
    print(items)
print('------------------------------')
for items in flatten(d_level3, depth=0):
    print(items)
print('------------------------------')
for items in flatten(d_level3, depth=1):
    print(items)
print('------------------------------')
for items in flatten(d_level3, depth=2):
    print(items)

output:

('collection_1', 'group_1', 'a', 1)
('collection_1', 'group_2', 'b', 2)
('collection_1', 'group_2', 'c', 3)
------------------------------
('collection_1', {'group_1': {'a': 1}, 'group_2': {'b': 2, 'c': 3}})
------------------------------
('collection_1', 'group_1', {'a': 1})
('collection_1', 'group_2', {'b': 2, 'c': 3})
------------------------------
('collection_1', 'group_1', 'a', 1)
('collection_1', 'group_2', 'b', 2)
('collection_1', 'group_2', 'c', 3)

With depth=None the depth is ignored and everything is flattened, just like before. By specifying a depth from 0 to 2, you can control how deep the iteration goes. Here is the code:

def flatten(d, base=(), depth=None):
    for k, v in d.items():
        if not isinstance(v, dict):
            yield base + (k, v)
        elif depth is None:
            # No limit: descend all the way down.
            yield from flatten(v, base + (k,))
        elif depth == 0:
            # Limit reached: yield the nested dict itself as the value.
            yield base + (k, v)
        else:
            yield from flatten(v, base + (k,), depth - 1)
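The question also asks how to auto-detect the number of levels. A minimal sketch, assuming that only dict values count as a nested level and that an empty dict counts as one level; nesting_depth is a hypothetical helper name, not something from the answers here:

```python
def nesting_depth(d):
    """Return the number of dict levels in d (1 for a flat dict)."""
    if not isinstance(d, dict):
        return 0
    # default=0 covers the empty dict, which has no values to inspect.
    return 1 + max((nesting_depth(v) for v in d.values()), default=0)


d_level1 = {"a": 1, "b": 2, "c": 3}
d_level2 = {"group_1": {"a": 1}, "group_2": {"b": 2, "c": 3}}
d_level3 = {"collection_1": d_level2}

for d in (d_level1, d_level2, d_level3):
    print(nesting_depth(d))
```

With the depth-aware flatten above, passing depth=nesting_depth(d) - 1 should correspond to a full flatten, since depth=2 fully flattens the three-level example.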

Here's a quick and dirty solution for you:

d_level1 = {"a":1,"b":2,"c":3}
d_level2 = {"group_1":{"a":1}, "group_2":{"b":2,"c":3}}
d_level3 = {"collection_1":d_level2}

def flatten(d_in, base=()):
    for k in d_in:
        if type(d_in[k]) == dict:
            flatten(d_in[k], base+(k,))
        else:
            print(base + (k, d_in[k]))

flatten(d_level1)
# ('a', 1)
# ('b', 2)
# ('c', 3)

flatten(d_level2)
#('group_1', 'a', 1)
#('group_2', 'b', 2)
#('group_2', 'c', 3)

flatten(d_level3)
# ('collection_1', 'group_1', 'a', 1)
# ('collection_1', 'group_2', 'b', 2)
# ('collection_1', 'group_2', 'c', 3)

Be aware: Python's default recursion limit is 1000 (see sys.getrecursionlimit()). So when using recursion in Python, think carefully about what you're trying to do, and be prepared to catch a RecursionError (a subclass of RuntimeError) if you call a recursive function like this on very deeply nested data.
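As a rough illustration of that warning, the sketch below builds a dictionary nested deeper than the current recursion limit and shows the recursive approach failing. flatten_recursive and make_nested are hypothetical helpers written for this demo (a list-returning variant of flatten, so that nothing is printed), and RecursionError assumes Python 3.5 or later.

```python
import sys


def flatten_recursive(d, base=()):
    """Recursive flatten that returns a list of tuples instead of printing."""
    rows = []
    for k, v in d.items():
        if isinstance(v, dict):
            rows.extend(flatten_recursive(v, base + (k,)))
        else:
            rows.append(base + (k, v))
    return rows


def make_nested(levels):
    """Build {"k": {"k": ... {"k": 0}}} with `levels` dicts, without recursion."""
    d = {"k": 0}
    for _ in range(levels - 1):
        d = {"k": d}
    return d


# Nest deeper than the interpreter allows recursive calls to go.
deep = make_nested(sys.getrecursionlimit() + 100)
try:
    flatten_recursive(deep)
    overflowed = False
except RecursionError:
    overflowed = True

print("hit the recursion limit:", overflowed)
```

Data this deep is unusual in practice, so for typical configuration-style dictionaries the recursive version is fine.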

EDIT: From the comments I realized I'd made a mistake: I did not add the key to the level-1 dict output, and I was using a mutable structure as a default argument. I fixed both, added parens to the print statement, and reposted. The output now matches the OP's desired output and uses better, more modern Python.

Try out this code. It also supports a combination of levels:

from typing import List, Tuple


def iterate_multilevel_dictionary(d: dict):
    # Each stack entry is (dict to iterate, list of prefix keys leading to it).
    dicts_to_iterate: List[Tuple[dict, list]] = [(d, [])]
    while dicts_to_iterate:
        # pop() takes the most recently added dict, so nested dicts are
        # visited last-in, first-out rather than in insertion order.
        current_dict, prefix = dicts_to_iterate.pop()
        for k, v in current_dict.items():
            if isinstance(v, dict):
                dicts_to_iterate.append((v, prefix + [k]))
            else:
                yield prefix + [k, v]


if __name__ == '__main__':
    d_level1 = {"a": 1, "b": 2, "c": 3}
    print(f"test for {d_level1}")
    for items in iterate_multilevel_dictionary(d_level1):
        print(items)
    d_level2 = {"group_1": {"a": 1}, "group_2": {"b": 2, "c": 3}}
    print(f"test for {d_level2}")
    for items in iterate_multilevel_dictionary(d_level2):
        print(items)

    d_level3 = {"collection_1": d_level2}
    print(f"test for {d_level3}")
    for items in iterate_multilevel_dictionary(d_level3):
        print(items)

    d_level123 = {}
    for part in (d_level1, d_level2, d_level3):
        d_level123.update(part)
    print(f"test for {d_level123}")
    for items in iterate_multilevel_dictionary(d_level123):
        print(items)

The output is:

test for {'a': 1, 'b': 2, 'c': 3}
['a', 1]
['b', 2]
['c', 3]
test for {'group_1': {'a': 1}, 'group_2': {'b': 2, 'c': 3}}
['group_2', 'b', 2]
['group_2', 'c', 3]
['group_1', 'a', 1]
test for {'collection_1': {'group_1': {'a': 1}, 'group_2': {'b': 2, 'c': 3}}}
['collection_1', 'group_2', 'b', 2]
['collection_1', 'group_2', 'c', 3]
['collection_1', 'group_1', 'a', 1]
test for {'a': 1, 'b': 2, 'c': 3, 'group_1': {'a': 1}, 'group_2': {'b': 2, 'c': 3}, 'collection_1': {'group_1': {'a': 1}, 'group_2': {'b': 2, 'c': 3}}}
['a', 1]
['b', 2]
['c', 3]
['collection_1', 'group_2', 'b', 2]
['collection_1', 'group_2', 'c', 3]
['collection_1', 'group_1', 'a', 1]
['group_2', 'b', 2]
['group_2', 'c', 3]
['group_1', 'a', 1]

Recursion is another approach, but I thought writing it without recursion would be more challenging and more efficient :)
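Note that the output above comes out in a different order from the recursive versions, because the list of pending dicts is used as a stack. If insertion order matters, one possible variant is to keep a stack of item iterators instead, so that each level resumes where it left off. This is only a sketch; flatten_iterative is a hypothetical name, and it yields tuples rather than lists.

```python
def flatten_iterative(d):
    """Yield (key1, ..., keyN, value) tuples in insertion order, without recursion."""
    # Each stack entry is (prefix keys, iterator over that level's items).
    stack = [((), iter(d.items()))]
    while stack:
        prefix, items = stack[-1]
        for k, v in items:
            if isinstance(v, dict):
                # Descend into the child first; the parent iterator keeps
                # its position and is resumed once the child is exhausted.
                stack.append((prefix + (k,), iter(v.items())))
                break
            yield prefix + (k, v)
        else:
            # The loop ended without break: this level is exhausted.
            stack.pop()


d_level2 = {"group_1": {"a": 1}, "group_2": {"b": 2, "c": 3}}
d_level3 = {"collection_1": d_level2}
for items in flatten_iterative(d_level3):
    print(items)
```

Because it never calls itself, this variant is not bounded by the interpreter's recursion limit, only by available memory.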
