
Convert a nested dictionary with lists of dictionaries to a Pandas DataFrame

I have a dictionary like this:

{
    'person1' : {
        'category1' : [
            {'time' : time1, 'property1' : value1, 'property2' : value2, ...},
            {'time' : time2, 'property1' : value3, 'property2' : value4, ...},
            ...
        ],
        'category2' : [
            {...}, 
            ...
        ],
        ...
    },
    'person2' : {
        'category1' : [
            {'time' : time3, 'property1' : value5, 'property2' : value6, ...},
            {'time' : time4, 'property1' : value7, 'property2' : value8, ...},
        ],
        'category2' : [
            {...},
            ...
        ],
        ...
    }
}

that I'd like to convert to a DataFrame like this:

'time'    'person'    'category'    'property1'    'property2'  ...  'propertyn'
--------------------------------------------------------------------------------
 time1     person1     category1      value1         value2     ...     value
 time2     person1     category1      value3         value4     ...     value
  ...      person1     category2       ...            ...       ...      ...
                                        .
                                        .
                                        .
 time3     person2     category1      value5         value6     ...     value
 time4     person2     category1      value7         value8     ...     value
  ...      person2     category2       ...            ...       ...      ...
                                        .
                                        .
                                        .
 timen     personn     category1      valuen         valuen     ...     value

I thought about traversing the dictionary with for-loops, but I'd like to avoid them to keep the conversion as efficient as possible. I've also looked into pd.DataFrame.from_dict(), but it is not enough on its own for this structure.

This answer is the closest I've seen, but their inner lists are just lists of values instead of a list of dictionaries like mine.

I would appreciate any help with this conversion!

You can create a small DataFrame for each person/category pair and concatenate them:

# d is the nested dictionary from the question; pandas is imported as pd
pd.concat([pd.DataFrame(u).assign(category=c, person=p) for p, data in d.items() for c, u in data.items()])

Output:

    time property1 property2   category   person
0  time1    value1    value2  category1  person1
1  time2    value3    value4  category1  person1
0  time3    value5    value6  category1  person2
1  time4    value7    value8  category1  person2
0  time3    value5    value6  category2  person2
1  time4    value7    value8  category2  person2
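
For anyone who wants to run this end to end, here is a minimal sketch under a few assumptions. The sample dictionary is hypothetical (placeholder strings stand in for real times and values), and the function names concat_frames and flatten_records are made up for illustration. ignore_index=True and the column reordering are additions, not part of the one-liner above; they only give a clean 0..n-1 index and the column order shown in the question. flatten_records is an alternative that builds one flat list of row dicts and calls the DataFrame constructor once; it may be cheaper when there are many small lists, but that is not measured here.

```python
import pandas as pd

# Hypothetical sample data in the shape described in the question.
d = {
    "person1": {
        "category1": [
            {"time": "time1", "property1": "value1", "property2": "value2"},
            {"time": "time2", "property1": "value3", "property2": "value4"},
        ],
    },
    "person2": {
        "category1": [
            {"time": "time3", "property1": "value5", "property2": "value6"},
            {"time": "time4", "property1": "value7", "property2": "value8"},
        ],
    },
}


def concat_frames(nested):
    """One small DataFrame per (person, category) pair, then a single concat."""
    frames = [
        pd.DataFrame(rows).assign(person=person, category=category)
        for person, categories in nested.items()
        for category, rows in categories.items()
    ]
    # ignore_index=True replaces the repeated 0, 1, 0, 1 index with 0..n-1.
    return pd.concat(frames, ignore_index=True)


def flatten_records(nested):
    """Flatten to one list of row dicts first, then build a single DataFrame."""
    records = [
        {"person": person, "category": category, **row}
        for person, categories in nested.items()
        for category, rows in categories.items()
        for row in rows
    ]
    return pd.DataFrame(records)


# Move the identifying columns to the front, as in the layout the question asks for.
lead = ["time", "person", "category"]
df_concat = concat_frames(d)
df_concat = df_concat[lead + [c for c in df_concat.columns if c not in lead]]
df_flat = flatten_records(d)
df_flat = df_flat[lead + [c for c in df_flat.columns if c not in lead]]

print(df_concat)
```

Both versions still loop over the dictionary (inside comprehensions), which is hard to avoid for a structure nested this way; the difference is only in how many DataFrame objects get created along the way.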
