Pivot_table MultiIndex to columns

I have the following table:

In [303]: table.head()
Out[303]: 
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

I would like to create a simple pd.DataFrame where:

  • columns = [1, 2, ..., 52] (weekofyear)
  • rows = [0, 1, ..., 6] (weekday)
  • values = np.sum (of people)

I tried using pd.pivot_table, which gave me the expected values:

In [308]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values=["people"], aggfunc=[np.sum])
     ...: p
     ...: 
Out[308]: 
              sum                                             ...             \
           people                                             ...              
weekofyear     1    2    3    4    5    6    7    8   9    10 ...    43   44   
weekday                                                       ...              
0             162   86   84   95   92   98  108  102  97   87 ...   108   86   
1              95  113   88   78  108  112   98  104  87  105 ...    85   82   
2             102   70   93   82  103   80  103   85  82   96 ...    87  105   
3              87   91  101   83   91  100  100   80  89   86 ...    87   91   
4             111   91  110  103   93  116  110   99  78   77 ...    83  102   
5             117  107   99   88   97   90  100   91  97   88 ...   103  110   
6              92   95   90   86   91  103   98  100  89   96 ...    94  101   



weekofyear   45   46   47   48   49   50   51   52  
weekday                                             
0            99   92   99   83  107  106   93  107  
1           105   83  101   93  102   89  113   84  
2            96   84  110   83  104   84   84  116  
3            87   96   87   88   88   83  113   93  
4            93   81  104  108   72  101  109   97  
5            81  107   97   89   86  108  113  101  
6            93   92   93   91   89   96   93  226  

[7 rows x 52 columns]

But instead of plain weekofyear columns, I got stuck with a MultiIndex I could not get rid of, as shown below:

In [309]: p.columns
Out[309]: 
MultiIndex(levels=[['sum'], ['people'], [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52]],
           labels=[[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51]],
           names=[None, None, 'weekofyear'])

while the index seems fine:

In [311]: p.index
Out[311]: Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64', name='weekday')

I tried playing with the unstack() and reset_index() functions, without success.

Am I missing something?

Instead of passing lists to values and aggfunc, try passing a single value to each. Example -

p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

Demo -

In [3]: table
Out[3]:
            people  weekday  weekofyear
2012-01-01     119        6          52
2012-01-02      76        0           1
2012-01-03      95        1           1
2012-01-04     102        2           1
2012-01-05      87        3           1

In [12]: p = pd.pivot_table(table, index=["weekday"], columns=["weekofyear"], values="people", aggfunc=np.sum)

In [13]: p
Out[13]:
weekofyear   1    52
weekday
0            76  NaN
1            95  NaN
2           102  NaN
3            87  NaN
6           NaN  119

In [14]: p.columns
Out[14]: Int64Index([1, 52], dtype='int64', name='weekofyear')

From the documentation -

aggfunc : function, default numpy.mean, or list of functions
If list of functions passed, the resulting pivot table will have hierarchical columns whose top level are the function names (inferred from the function objects themselves)
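
The quoted behaviour can be checked with a minimal sketch. It assumes a five-row frame rebuilt from the table.head() output in the question; the construction of table and the names single and listed are illustrative and not part of the original post. aggfunc is given as the string "sum" instead of np.sum, which pandas also accepts.

```python
import pandas as pd

# Assumption: a small stand-in for the asker's table, rebuilt from table.head().
table = pd.DataFrame(
    {
        "people": [119, 76, 95, 102, 87],
        "weekday": [6, 0, 1, 2, 3],
        "weekofyear": [52, 1, 1, 1, 1],
    },
    index=pd.to_datetime(
        ["2012-01-01", "2012-01-02", "2012-01-03", "2012-01-04", "2012-01-05"]
    ),
)

# A single aggregation function: the columns stay a flat index of weeks.
single = pd.pivot_table(
    table, index="weekday", columns="weekofyear", values="people", aggfunc="sum"
)

# A list of functions: the function name becomes an extra top column level.
listed = pd.pivot_table(
    table, index="weekday", columns="weekofyear", values="people", aggfunc=["sum"]
)

print(single.columns.nlevels)  # 1
print(listed.columns.nlevels)  # 2
print(listed.columns.get_level_values(0).unique().tolist())  # ['sum']
```

Even a one-element list is enough to trigger the hierarchical columns, which is why the call in the question ends up with a 'sum' level.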

The same applies to values, though the documentation does not mention it specifically.
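
A short sketch of the values case follows, again assuming a five-row frame rebuilt from the question's table.head() output (the names p, p_values and flat are illustrative). The last step, dropping levels with MultiIndex.droplevel, is an alternative that is not part of the original answer; it may help when the pivot table has already been built with lists.

```python
import pandas as pd

# Assumption: a small stand-in for the asker's table, rebuilt from table.head().
table = pd.DataFrame(
    {
        "people": [119, 76, 95, 102, 87],
        "weekday": [6, 0, 1, 2, 3],
        "weekofyear": [52, 1, 1, 1, 1],
    }
)

# Lists for both values and aggfunc, as in the question: three column levels.
p = pd.pivot_table(
    table, index=["weekday"], columns=["weekofyear"],
    values=["people"], aggfunc=["sum"],
)
print(p.columns.nlevels)  # 3

# A list only for values: the value name still adds one level.
p_values = pd.pivot_table(
    table, index=["weekday"], columns=["weekofyear"],
    values=["people"], aggfunc="sum",
)
print(p_values.columns.nlevels)  # 2

# Alternative (not from the original answer): flatten an existing pivot
# by dropping the two outer levels and keeping only weekofyear.
flat = p.copy()
flat.columns = flat.columns.droplevel([0, 1])
print(flat.columns.tolist())  # [1, 52]
```

Passing scalars up front, as the answer suggests, is the simpler route; dropping levels afterwards is only worth it when the call itself cannot be changed.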
