Construct a MultiIndex pandas DataFrame from a nested Python dictionary

I would like to construct a MultiIndex DataFrame from a deeply-nested dictionary of the form

md = {'50': {'100': {'col1': ('0.100',
                              '0.200',
                              '0.300',
                              '0.400'),
                     'col2': ('6.263E-03',
                              '6.746E-03',
                              '7.266E-03',
                              '7.825E-03')},
             '101': {'col1': ('0.100',
                              '0.200',
                              '0.300',
                              '0.400'),
                     'col2': ('6.510E-03',
                              '7.011E-03',
                              '7.553E-03',
                              '8.134E-03')},
             '102': ...
            },
      '51': ...
     }

I've tried

df = pd.DataFrame.from_dict({(i,j): md[i][j][v] for i in md.keys() for j in md[i].keys() for v in md[i][j]}, orient='index')

following "Construct pandas DataFrame from items in nested dictionary", but I get a DataFrame with 1 row and many columns.

Bonus: I'd also like to label the MultiIndex keys and the columns 'col1' and 'col2', as well as convert the strings to int and float, respectively.

How can I reconstruct my original dictionary from the DataFrame? I tried df.to_dict('list').

Check out this answer: https://stackoverflow.com/a/24988227/9404057. This method unpacks the keys and values of the nested dictionary and reshapes the data into a flat dictionary keyed by tuples, which pandas turns into MultiIndex columns. Note that if you are using Python 3, you will need to use .items() rather than the .iteritems() shown in the linked answer:

>>> import pandas as pd
>>> reform = {(firstKey, secondKey, thirdKey): values
...           for firstKey, middleDict in md.items()
...           for secondKey, innerdict in middleDict.items()
...           for thirdKey, values in innerdict.items()}
>>> df = pd.DataFrame(reform)
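To see what this produces, here is a minimal self-contained sketch. It assumes only the '50' branch of the dictionary from the question (the elided '102' and '51' entries are left out), and the short loop variable names are illustrative.

```python
import pandas as pd

# Only the '50' branch from the question; elided entries are omitted
md = {'50': {'100': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.263E-03', '6.746E-03', '7.266E-03', '7.825E-03')},
             '101': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.510E-03', '7.011E-03', '7.553E-03', '8.134E-03')}}}

# Flatten the three nesting levels into tuple keys
reform = {(k1, k2, k3): values
          for k1, mid in md.items()
          for k2, inner in mid.items()
          for k3, values in inner.items()}
df = pd.DataFrame(reform)

print(df.shape)            # one row per tuple element, one column per (k1, k2, k3)
print(df.columns.nlevels)  # the tuple keys become a 3-level column MultiIndex

# Selecting by the two outer keys leaves a plain frame with col1 and col2
sub = df[('50', '100')]
print(sub)
```

Each tuple key becomes one column, so the values stay as strings at this point.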

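Note that pandas.DataFrame.rename() changes labels, not data types; converting the stored strings is a separate step, for example with pandas.DataFrame.astype(). To replace the column labels col1 and col2 themselves with an int and a float, you can use rename() and specify any values you want: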

df.rename({'col1':1, 'col2':2.5}, axis=1, level=2, inplace=True)
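Since rename() only relabels columns, converting the stored strings needs a separate step. The following is a sketch of one way to do it, reading the question as wanting the dictionary keys as int and the values as float; that reading, the name df_num, and the reduced sample dictionary are assumptions.

```python
import pandas as pd

# Only the '50' branch from the question; elided entries are omitted
md = {'50': {'100': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.263E-03', '6.746E-03', '7.266E-03', '7.825E-03')},
             '101': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.510E-03', '7.011E-03', '7.553E-03', '8.134E-03')}}}
df = pd.DataFrame({(k1, k2, k3): values
                   for k1, mid in md.items()
                   for k2, inner in mid.items()
                   for k3, values in inner.items()})

# Convert every stored string to a float
df_num = df.astype(float)

# Rebuild the column labels with the two outer levels as ints
# (assumes those keys are always integer-like strings)
df_num.columns = pd.MultiIndex.from_tuples(
    [(int(k1), int(k2), k3) for k1, k2, k3 in df_num.columns])

print(df_num.dtypes)
```

After this, columns are addressed with integer keys, for example df_num[(50, 100, 'col2')].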

Also, if you'd rather have the levels on the index than on the columns, you can transpose the frame with pandas.DataFrame.T.
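As a rough sketch, transposing and then naming the index levels could look like this. The level names 'first', 'second' and 'column' are hypothetical placeholders, and the sample again uses only the '50' branch from the question.

```python
import pandas as pd

# Only the '50' branch from the question; elided entries are omitted
md = {'50': {'100': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.263E-03', '6.746E-03', '7.266E-03', '7.825E-03')},
             '101': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.510E-03', '7.011E-03', '7.553E-03', '8.134E-03')}}}
df = pd.DataFrame({(k1, k2, k3): values
                   for k1, mid in md.items()
                   for k2, inner in mid.items()
                   for k3, values in inner.items()})

# Move the three key levels from the columns to the row index
tidy = df.T

# Label the levels; set_names returns a new index, so df.columns is untouched
tidy.index = tidy.index.set_names(['first', 'second', 'column'])

print(tidy)
```

Rows can then be looked up with a full key such as tidy.loc[('50', '101', 'col1')].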

If you wanted to reconstruct your dictionary from this MultiIndex, you could do something like this:

>>> md2 = {}
>>> for i in df.columns:
...     # i is a (first key, second key, column name) tuple
...     if i[0] not in md2:
...         md2[i[0]] = {}
...     if i[1] not in md2[i[0]]:
...         md2[i[0]][i[1]] = {}
...     md2[i[0]][i[1]][i[2]] = tuple(df[i].values)
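A more compact variant of the same loop uses dict.setdefault, and a round-trip comparison shows whether the nesting was recovered. This is a sketch on the '50' branch of the question's data only; it also illustrates that df.to_dict('list') returns a flat dictionary keyed by tuples, which is why it does not restore the nesting by itself.

```python
import pandas as pd

# Only the '50' branch from the question; elided entries are omitted
md = {'50': {'100': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.263E-03', '6.746E-03', '7.266E-03', '7.825E-03')},
             '101': {'col1': ('0.100', '0.200', '0.300', '0.400'),
                     'col2': ('6.510E-03', '7.011E-03', '7.553E-03', '8.134E-03')}}}
df = pd.DataFrame({(k1, k2, k3): values
                   for k1, mid in md.items()
                   for k2, inner in mid.items()
                   for k3, values in inner.items()})

# to_dict('list') keeps the tuple keys flat: {('50', '100', 'col1'): [...], ...}
flat = df.to_dict('list')

# Rebuild the nesting one column at a time
md2 = {}
for k1, k2, k3 in df.columns:
    md2.setdefault(k1, {}).setdefault(k2, {})[k3] = tuple(df[(k1, k2, k3)])

print(md2 == md)
```

The comparison only holds while the values are still the original strings; after a numeric conversion the rebuilt dictionary would contain floats instead.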
