简体   繁体   English

来自熊猫数据框嵌套字典的熊猫数据框

[英]Pandas Dataframe from nested dictionary of pandas dataframes

I have a dictionary with keys of 2 levels, and values at the second level being dataframes:我有一个带有 2 个级别键的字典,第二个级别的值是数据帧:

my_dict = {
           'elem1':{'day1': pd.DataFrame(columns=['Col1', 'Col2']),
                    'day2': pd.DataFrame(columns=['Col1', 'Col2'])
                   },
           'elem2':{'day1': pd.DataFrame(columns=['Col1', 'Col2']),
                    'day2': pd.DataFrame(columns=['Col1', 'Col2'])
                    'day3': pd.DataFrame(columns=['Col1', 'Col2'])
                   }
          }

How do I convert this to a multi-index pandas dataframe of the form:如何将其转换为以下形式的多索引熊猫数据框:

                 Col1    Col2
elem1    day1    ...      ...
         day2    ...      ...
elem2    day1    ...      ...
         day2    ...      ...

I have looked through these answers but am unable to stitch together a solution:我已经浏览了这些答案,但无法拼凑出一个解决方案:

Idea is create tuples by both keys and pass to concat , third level of MultiIndex is created from index values of original DataFrame s, if necessary you can remove it:想法是通过两个键创建元组并传递给concat ,第三级MultiIndex是从原始DataFrame的索引值创建的,如有必要,您可以将其删除:

my_dict = {
           'elem1':{'day1': pd.DataFrame(1, columns=['Col1', 'Col2'], index=[1,2]),
                    'day2': pd.DataFrame(2, columns=['Col1', 'Col2'], index=[1,2])
                   },
           'elem2':{'day1': pd.DataFrame(3, columns=['Col1', 'Col2'], index=[1,2]),
                    'day2': pd.DataFrame(4, columns=['Col1', 'Col2'], index=[1,2]),
                    'day3': pd.DataFrame(5, columns=['Col1', 'Col2'], index=[1,2])
                   }
          }

d = {(k1, k2): v2 for k1, v1 in my_dict.items() for k2, v2 in v1.items()}
print (d)
{('elem1', 'day1'):    Col1  Col2
1     1     1
2     1     1, ('elem1', 'day2'):    Col1  Col2
1     2     2
2     2     2, ('elem2', 'day1'):    Col1  Col2
1     3     3
2     3     3, ('elem2', 'day2'):    Col1  Col2
1     4     4
2     4     4, ('elem2', 'day3'):    Col1  Col2
1     5     5
2     5     5}

df = pd.concat(d, sort=False)
print (df)
              Col1  Col2
elem1 day1 1     1     1
           2     1     1
      day2 1     2     2
           2     2     2
elem2 day1 1     3     3
           2     3     3
      day2 1     4     4
           2     4     4
      day3 1     5     5
           2     5     5

df = pd.concat(d, sort=False).reset_index(level=2, drop=True)
print (df)
            Col1  Col2
elem1 day1     1     1
      day1     1     1
      day2     2     2
      day2     2     2
elem2 day1     3     3
      day1     3     3
      day2     4     4
      day2     4     4
      day3     5     5
      day3     5     5

Try like this:像这样尝试:

my_dict = {
           'elem1':{'day1': pd.DataFrame(columns=['Col1', 'Col2']),
                    'day2': pd.DataFrame(columns=['Col1', 'Col2'])
                   },
           'elem2':{'day1': pd.DataFrame(columns=['Col1', 'Col2']),
                    'day2': pd.DataFrame(columns=['Col1', 'Col2'])
                    'day3': pd.DataFrame(columns=['Col1', 'Col2'])
                   }
          }
nd = {}
for x in my_dict:
  nd.update(my_dict[x])
df = pd.DataFrame(nd,index=my_dict.keys())

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM