How do I turn a nested dict of lists (with dicts) into a single list of dicts?

Question

I would like to convert a pretty complicated dict into a single list with dicts (like a SQL join).

Input:

{
    'company': 'A',
    'employee': [
        {'name': 'John', 'skills': ['python', 'java']},
        {'name': 'Mary', 'skills': ['web', 'databases']}
    ]
}

Output:

[
    {'company': 'A', 'employee': {'name': 'John', 'skills': 'python'}},
    {'company': 'A', 'employee': {'name': 'John', 'skills': 'java'}},
    {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'web'}},
    {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'databases'}},
]

I would like to be able to add more nested lists as well as more companies, so hardcoding the levels of dictionaries and lists is not an option. Recursion seems to be my only option.

Does anyone have any pointers on how to go about doing this?

For the first level I came up with this, but I run into trouble when I start calling the flatten function recursively:


def flatten(input):
    output = []
    for item in input:
        if isinstance(input[item], list):
            for subitem in input[item]:
                copy = input
                # Doesn't work
                # copy[item] = flatten(subitem)

                # Only works for the first layer
                copy[item] = subitem
                output.append(copy.copy())
    return output
print(flatten(start))

This code works for the first level (adding a new record for each employee), but calling flatten recursively makes it return an empty list.

Answer 1

Assuming there's always exactly one value in the input dict that is a list, you can first find the key with the list value, and then return a list of copies of the current dict in which that key's value is replaced by each of the list's items, recursively flattened, until the value is no longer a dict:

def flatten(d):
    if isinstance(d, dict):
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
        return [{**d, key: v} for record in lst for v in flatten(record)]
    else:
        return [d]

Given the sample input dict as d, flatten(d) returns:

[{'company': 'A', 'employee': {'name': 'John', 'skills': 'python'}},
 {'company': 'A', 'employee': {'name': 'John', 'skills': 'java'}},
 {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'web'}},
 {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'databases'}}]
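
To see how the recursion builds these rows, it can help to call the function on a single employee record first. The following is a minimal sketch that reuses the flatten above; the names record, partial and rows are only illustrative and not part of the original answer:

```python
def flatten(d):
    # Same logic as the first version above.
    if isinstance(d, dict):
        # Find the (assumed single) key whose value is a list.
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
        # Emit one copy of d per flattened element of that list.
        return [{**d, key: v} for record in lst for v in flatten(record)]
    return [d]

record = {'name': 'John', 'skills': ['python', 'java']}

# Base case: a non-dict value is wrapped in a one-element list.
assert flatten('python') == ['python']

# One level up: the record is repeated once per skill.
partial = flatten(record)
assert partial == [
    {'name': 'John', 'skills': 'python'},
    {'name': 'John', 'skills': 'java'},
]

# The outer call substitutes each of those dicts for the 'employee' list.
rows = flatten({'company': 'A', 'employee': [record]})
assert rows == [{'company': 'A', 'employee': p} for p in partial]
```

Each level therefore multiplies the rows of the level below it, which is what gives the join-like result.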

If some level of the dict may have no list value, the call to next in the code above raises StopIteration. You can catch that exception and return the input dict unchanged inside a list. Since that is the same behavior as when the input isn't a dict, you can also catch AttributeError (raised when the given object has no items attribute) and handle both cases with the same handler:

def flatten(d):
    try:
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
    except (StopIteration, AttributeError):
        return [d]
    return [{**d, key: v} for record in lst for v in flatten(record)]


solution1
1 ACCEPTED 2021-06-02 08:42:32
