
Convert nested dictionary to multilevel column dataframe

I have a dictionary that I want to convert to a dataframe with multilevel columns, where the index is made up of the outermost keys of the dictionary.

my_dict = {'key1': {'sub-key1': {'sub-sub-key1':'a','sub-sub-key2':'b'}, 'sub-key2': {'sub-sub-key1':'aa','sub-sub-key2':'bb'}},
    'key2': {'sub-key1': {'sub-sub-key1':'c','sub-sub-key2':'d'}, 'sub-key2': {'sub-sub-key1':'cc','sub-sub-key2':'dd'}}}

My desired output should look like:

               sub-key1                        sub-key2
    sub-sub-key1    sub-sub-key2     sub-sub-key1    sub-sub-key2
key1    a               b                aa               bb
key2    c               d                cc               dd

I tried to use concat with pd.concat({k: pd.DataFrame.from_dict(my_dict, orient='index') for k, v in my_dict.items()}, axis=1) but the result is not as expected.

I also tried to restructure the dictionary.

reformed_dict = {}
for outerKey, innerDict in my_dict.items():
    for innerKey, values in innerDict.items():
        reformed_dict[(outerKey, innerKey)] = values
pd.DataFrame(reformed_dict)

Again the result was not right: the top-level columns and the index are interchanged.

Is there any other way to do this?

You were pretty close with concat; you just need to unstack afterwards, like so:

res = pd.concat({k: pd.DataFrame.from_dict(v, orient='columns') 
                 for k, v in my_dict.items()}
         ).unstack()
print(res)
#          sub-key1                  sub-key2             
#      sub-sub-key1 sub-sub-key2 sub-sub-key1 sub-sub-key2
# key1            a            b           aa           bb
# key2            c            d           cc           dd
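
To see why the unstack step is needed, it can help to look at the intermediate result. The following is a minimal sketch, assuming the my_dict from the question; the names stacked and res are only illustrative.

```python
import pandas as pd

my_dict = {
    'key1': {'sub-key1': {'sub-sub-key1': 'a', 'sub-sub-key2': 'b'},
             'sub-key2': {'sub-sub-key1': 'aa', 'sub-sub-key2': 'bb'}},
    'key2': {'sub-key1': {'sub-sub-key1': 'c', 'sub-sub-key2': 'd'},
             'sub-key2': {'sub-sub-key1': 'cc', 'sub-sub-key2': 'dd'}},
}

# Step 1: build one small frame per outer key and stack them vertically.
# The dict keys passed to concat become the outer level of a row MultiIndex,
# so rows are (key, sub-sub-key) pairs and columns are the sub-keys.
stacked = pd.concat({k: pd.DataFrame.from_dict(v, orient='columns')
                     for k, v in my_dict.items()})

# Step 2: unstack() moves the innermost row level (the sub-sub-keys)
# underneath the existing columns, leaving only the outer keys as the index.
res = stacked.unstack()
```

After step 1 the frame has four rows and two columns; after step 2 it has two rows and a two-level column index, which is the layout asked for in the question.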

Try this one-liner, which uses pd.concat, a dict comprehension and pd.DataFrame.from_dict to build the dataframe, and DataFrame.unstack to adjust its structure.

df = pd.concat({k: pd.DataFrame.from_dict(v) for k, v in my_dict.items()}).unstack()

Result:

              sub-key1                        sub-key2
    sub-sub-key1    sub-sub-key2    sub-sub-key1    sub-sub-key2
key1    a                 b              aa                bb
key2    c                 d              cc                dd

First restructure your dictionary so that each outer key maps to a flat dictionary keyed by (sub-key, sub-sub-key) tuples:

>>> reform = {index: {(outerKey, innerKey): values for outerKey, innerDict in outerDict.items() for innerKey, values in innerDict.items()} for index, outerDict in my_dict.items()}
>>> reform
{'key1': {('sub-key1', 'sub-sub-key1'): 'a',
  ('sub-key1', 'sub-sub-key2'): 'b',
  ('sub-key2', 'sub-sub-key1'): 'aa',
  ('sub-key2', 'sub-sub-key2'): 'bb'},
 'key2': {('sub-key1', 'sub-sub-key1'): 'c',
  ('sub-key1', 'sub-sub-key2'): 'd',
  ('sub-key2', 'sub-sub-key1'): 'cc',
  ('sub-key2', 'sub-sub-key2'): 'dd'}}

Then build the dataframe, using the outer keys as the index:

>>> df = pd.DataFrame.from_dict(reform, orient='index')
>>> df
         sub-key1                  sub-key2             
     sub-sub-key1 sub-sub-key2 sub-sub-key1 sub-sub-key2
key1            a            b           aa           bb
key2            c            d           cc           dd
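
A variant of the same flattening idea keeps the outer key inside every tuple, so that values from key1 and key2 cannot overwrite each other. This is only a sketch, assuming the my_dict from the question; the name flat and the Series-based route are introduced here for illustration and are not part of the answer above.

```python
import pandas as pd

my_dict = {
    'key1': {'sub-key1': {'sub-sub-key1': 'a', 'sub-sub-key2': 'b'},
             'sub-key2': {'sub-sub-key1': 'aa', 'sub-sub-key2': 'bb'}},
    'key2': {'sub-key1': {'sub-sub-key1': 'c', 'sub-sub-key2': 'd'},
             'sub-key2': {'sub-sub-key1': 'cc', 'sub-sub-key2': 'dd'}},
}

# Flatten to 3-tuples: (outer key, sub-key, sub-sub-key) -> value.
flat = {(key, sub, subsub): value
        for key, sub_dict in my_dict.items()
        for sub, subsub_dict in sub_dict.items()
        for subsub, value in subsub_dict.items()}

# A Series built from tuple keys gets a three-level MultiIndex;
# unstacking levels 1 and 2 turns them into the two column levels.
df = pd.Series(flat).unstack(level=[1, 2])
```

Because every entry carries its outer key, the flat dictionary has eight entries rather than four, and each row of the resulting frame holds its own values.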
