Converting a list of tuples into a dataframe with multi-level columns

I have a list of namedtuples that I want to convert to a dataframe. The tuples look like this:

import pandas as pd
from collections import namedtuple

s = pd.Series({'A': 1, 'B': 2}, 
              pd.Index([u'A', u'B'], 
                       name=u'submission_label'))

SingleExperimentStatistics = namedtuple('SingleExperimentStatistics', 
['metric_name', 'z_score', 'average'])

res = SingleExperimentStatistics(
  metric_name=None, 
  z_score=1.1826795129064109, 
  average=s,
)

Calling pd.DataFrame([res, res]) gives us

   metric_name  z_score  average
0  None         1.18268  submission_label A 1 B 2 dtype: int64
1  None         1.18268  submission_label A 1 B 2 dtype: int64 

But what I want is a table with MultiIndex columns, where A and B are column names under average. Basically, something like this:

   metric_name  z_score  average
                         A   B   
0  None         1.18268  1   2
1  None         1.18268  1   2 

What is the right way of doing this?

I would prefer to work with a simple (single-level) column index by breaking those series into two columns:

import pandas as pd
from collections import namedtuple
s = pd.Series({'A': 1, 'B': 2}, 
              pd.Index([u'A', u'B'], 
                       name=u'submission_label'))
SingleExperimentStatistics = namedtuple('SingleExperimentStatistics', 
    ['metric_name', 'z_score', 'average'])
res = SingleExperimentStatistics(
  metric_name=None, 
  z_score=1.1826795129064109, 
  average=s,
)

df = pd.DataFrame([res, res])
# Take an explicit copy so that adding columns below does not raise SettingWithCopyWarning
df1 = df.loc[:, ['metric_name', 'z_score']].copy()
df1['A'] = df['average'].apply(lambda x: x['A'])
df1['B'] = df['average'].apply(lambda x: x['B'])
print(df1)

  metric_name  z_score  A  B
0        None  1.18268  1  2
1        None  1.18268  1  2
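
If the labels are not known in advance, the two hard-coded apply calls can be replaced by expanding every Series at once. This is only a minimal sketch, assuming each average Series carries the same labels; the names results, scalars, expanded and flat are mine, not from the question:

```python
import pandas as pd
from collections import namedtuple

SingleExperimentStatistics = namedtuple(
    'SingleExperimentStatistics', ['metric_name', 'z_score', 'average'])

s = pd.Series({'A': 1, 'B': 2},
              index=pd.Index(['A', 'B'], name='submission_label'))
res = SingleExperimentStatistics(metric_name=None,
                                 z_score=1.1826795129064109,
                                 average=s)
results = [res, res]  # hypothetical name for the input list

# Scalar fields: one row per namedtuple.
scalars = pd.DataFrame(
    [(r.metric_name, r.z_score) for r in results],
    columns=['metric_name', 'z_score'])

# Series fields: one row per namedtuple, one column per label in the Series.
expanded = pd.DataFrame([r.average for r in results]).reset_index(drop=True)

# Both frames share the default 0..n-1 row index, so join aligns them row by row.
flat = scalars.join(expanded)
print(flat)
```

This keeps the flat layout shown above while picking up whatever labels the Series happen to contain.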

If you really want the multi-index, you can define it at this step:

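# Wrap zip in list(): in Python 3 zip returns an iterator, which older pandas versions reject
index = pd.MultiIndex.from_tuples(list(zip(['metric_name', 'z_score', 'average', 'average'],
                         ['' ,'', 'A', 'B'])))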
df1.columns = index
print(df1)

  metric_name  z_score average   
                             A  B
0        None  1.18268       1  2
1        None  1.18268       1  2
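
The two-level columns can also be built without typing the tuples by hand, by giving each piece its own MultiIndex and concatenating. Again, just a sketch under the same assumption that all Series share their labels; results, scalars, averages and wide are illustrative names of my own:

```python
import pandas as pd
from collections import namedtuple

SingleExperimentStatistics = namedtuple(
    'SingleExperimentStatistics', ['metric_name', 'z_score', 'average'])

s = pd.Series({'A': 1, 'B': 2},
              index=pd.Index(['A', 'B'], name='submission_label'))
res = SingleExperimentStatistics(metric_name=None,
                                 z_score=1.1826795129064109,
                                 average=s)
results = [res, res]  # hypothetical name for the input list

# Scalar fields and expanded Series, each with a default 0..n-1 row index.
scalars = pd.DataFrame(
    [(r.metric_name, r.z_score) for r in results],
    columns=['metric_name', 'z_score'])
averages = pd.DataFrame([r.average for r in results]).reset_index(drop=True)

# Scalar columns get an empty second level; Series labels go under 'average'.
scalars.columns = pd.MultiIndex.from_product([list(scalars.columns), ['']])
averages.columns = pd.MultiIndex.from_product([['average'], list(averages.columns)])

# Place the two blocks side by side.
wide = pd.concat([scalars, averages], axis=1)
print(wide)
```

With this layout, wide['average'] returns the sub-frame holding only the A and B columns, and wide[('z_score', '')] selects a single scalar column.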
