How do I turn a nested dict of lists (with dicts) into a single list of dicts?

I would like to convert a pretty complicated dict into a single list with dicts (like a SQL join).

Input:

{
    'company': 'A',
    'employee': [
        {'name': 'John', 'skills': ['python', 'java']},
        {'name': 'Mary', 'skills': ['web', 'databases']}
    ]
}

Output:

[
    {'company': 'A', 'employee': {'name': 'John', 'skills': 'python'}},
    {'company': 'A', 'employee': {'name': 'John', 'skills': 'java'}},
    {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'web'}},
    {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'databases'}},
]

I would like to be able to add more nested lists as well as more companies, etc., so hardcoding the levels of dictionaries and lists is not an option. Recursion seems to be my only option.

Does anyone have any pointers on how to go about doing this?

For the first level I came up with this, but I'm running into trouble when I start calling the flatten function recursively:


def flatten(input):
    output = []
    for item in input:
        if isinstance(input[item], list):
            for subitem in input[item]:
                copy = input
                # Doesn't work
                # copy[item] = flatten(subitem)

                # Only works for the first layer
                copy[item] = subitem
                output.append(copy.copy())
    return output
print(flatten(start))

This code works for the first level (adding a new record for each employee), but calling flatten recursively makes the output an empty list.

Assuming there is always exactly one value in the input dict that is a list, you can first find the key with the list value. Then return a list of copies of the current dict in which that key's value is replaced by each item of the list, recursively flattened, until the value is no longer a dict:

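def flatten(d):
    if isinstance(d, dict):
        # Find the (single) key whose value is a list.
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
        # One output dict per flattened list item, with the list replaced by that item.
        return [{**d, key: v} for record in lst for v in flatten(record)]
    else:
        # Anything that is not a dict (e.g. a string) is a leaf.
        return [d]

Given the sample input dict as d, flatten(d) returns: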

[{'company': 'A', 'employee': {'name': 'John', 'skills': 'python'}},
 {'company': 'A', 'employee': {'name': 'John', 'skills': 'java'}},
 {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'web'}},
 {'company': 'A', 'employee': {'name': 'Mary', 'skills': 'databases'}}]
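
To check that the same function copes with an extra level of lists and with more than one company, here is a small sketch. The projects key, the second company and the flatten_all helper are made up for illustration and are not part of the original question; the sketch also assumes every dict level still has exactly one list value.

```python
def flatten(d):
    # Same function as above: expand the one list-valued key, recursing into each item.
    if isinstance(d, dict):
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
        return [{**d, key: v} for record in lst for v in flatten(record)]
    else:
        return [d]


def flatten_all(companies):
    # Hypothetical helper: flatten each top-level dict and concatenate the rows.
    return [row for company in companies for row in flatten(company)]


# Invented sample data with one more level of nesting (skills -> projects).
companies = [
    {'company': 'A', 'employee': [
        {'name': 'John', 'skills': [
            {'skill': 'python', 'projects': ['p1', 'p2']},
        ]},
    ]},
    {'company': 'B', 'employee': [
        {'name': 'Ann', 'skills': [
            {'skill': 'web', 'projects': ['p3']},
        ]},
    ]},
]

rows = flatten_all(companies)
for row in rows:
    print(row)
```

Each row keeps the full nesting path, so the John record appears once per project and company B contributes a single row.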

If some level of the dict may have no list value, the call to next in the code above raises StopIteration. You can catch that exception and return the input dict as is, wrapped in a list. Since that is the same behavior as when the input is not a dict, you can also catch AttributeError (raised when the given object has no items attribute) and handle both cases with the same handler:

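def flatten(d):
    try:
        # Find the first key whose value is a list.
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
    except (StopIteration, AttributeError):
        # No list value at this level, or d is not a dict: treat d as a leaf.
        return [d]
    return [{**d, key: v} for record in lst for v in flatten(record)]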
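
A quick way to see what this version adds is to feed it inputs the first one cannot handle. The sample records below are invented for illustration; this is a sketch of the behavior, not an exhaustive test.

```python
def flatten(d):
    try:
        key, lst = next((k, v) for k, v in d.items() if isinstance(v, list))
    except (StopIteration, AttributeError):
        return [d]
    return [{**d, key: v} for record in lst for v in flatten(record)]


# An employee dict with no list value is kept unchanged inside its parent.
no_list = {'company': 'A', 'employee': [{'name': 'John'}]}
no_list_rows = flatten(no_list)
print(no_list_rows)

# An empty list produces no rows at all, much like an inner join.
empty = {'company': 'A', 'employee': []}
empty_rows = flatten(empty)
print(empty_rows)
```

Note the second case: a record whose list is empty disappears from the output, which may or may not be what you want.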
