Formatting a pandas pivot table

Question

I ran into a problem formatting a pivot table created by pandas. I built a matrix table between two columns (A, B) of my source data using pandas.pivot_table, with A as the columns and B as the index.

>> import pandas as pd
>> import numpy as np
>> df = pd.read_excel("data.xls")
>> table = pd.pivot_table(df, index=["B"],
    values='Count', columns=["A"], aggfunc=[np.sum],
    fill_value=0, margins=True, dropna=True)
>> table

It returns:

      sum
  A     1  2  3  All
  B 
  1     23 52  0  75
  2     16 35 12  65
  3     56  0  0  56
All     95 87 12 196

I would like to get a format like this:

                A      All_B
            1   2   3   
    1      23  52   0     75
B   2      16  35  12     65
    3      56   0   0     56
All_A      95  87  12    196

How can I do this? Thanks very much in advance.

Answer 1

The table returned by pd.pivot_table is convenient to work with (it has a single-level index and single-level columns) and normally does NOT require any further format manipulation. But if you insist on changing the format to the one in your post, you need to construct a multi-level index and multi-level columns using pd.MultiIndex. Here is an example of how to do it.

Before manipulation:

import pandas as pd
import numpy as np

np.random.seed(0)
a = np.random.randint(1, 4, 100)
b = np.random.randint(1, 4, 100)
df = pd.DataFrame(dict(A=a,B=b,Val=np.random.randint(1,100,100)))
# Pass the aggregation by name; recent pandas versions warn when given the builtin sum.
table = pd.pivot_table(df, index='A', columns='B', values='Val', aggfunc='sum', fill_value=0, margins=True)
print(table)


B       1     2     3   All
A                          
1     454   649   770  1873
2     628   576   467  1671
3     376   247   481  1104
All  1458  1472  1718  4648

After:

multi_level_column = pd.MultiIndex.from_arrays([['A', 'A', 'A', 'All_B'], [1,2,3,'']])
multi_level_index = pd.MultiIndex.from_arrays([['B', 'B', 'B', 'All_A'], [1,2,3,'']])
table.index = multi_level_index
table.columns = multi_level_column
print(table)

            A             All_B
            1     2     3      
B     1   454   649   770  1873
      2   628   576   467  1671
      3   376   247   481  1104
All_A    1458  1472  1718  4648
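
The example above hard-codes the labels for a 3x3 table. As a rough sketch only, the same relabeling can be derived from the pivot table itself, so it does not depend on the number of categories. The helper name label_pivot, its parameters, and the toy data are made up for illustration and are not part of pandas; the sketch assumes the table was built with margins=True and a single (non-list) aggfunc.

```python
import pandas as pd


def label_pivot(table, row_name, col_name, margins_name="All"):
    """Return a copy of a margins=True pivot table with two-level labels.

    Hypothetical helper: the regular rows/columns are grouped under
    row_name/col_name, and the margin row/column gets its own label.
    """
    def relabel(index, group_name, total_name):
        # Outer level: the group name, or the totals label for the margin entry.
        outer = [total_name if key == margins_name else group_name for key in index]
        # Inner level: the original key, or an empty string for the margin entry.
        inner = ["" if key == margins_name else key for key in index]
        return pd.MultiIndex.from_arrays([outer, inner])

    out = table.copy()
    # The bottom row holds per-column totals, the right column per-row totals,
    # matching the All_A / All_B naming asked for in the question.
    out.index = relabel(table.index, row_name, f"{margins_name}_{col_name}")
    out.columns = relabel(table.columns, col_name, f"{margins_name}_{row_name}")
    return out


# Made-up toy data; missing (A, B) pairs are filled with 0 by fill_value.
df = pd.DataFrame({
    "A": [1, 2, 1, 2, 3, 1],
    "B": [1, 1, 2, 2, 2, 3],
    "Count": [23, 52, 16, 35, 12, 56],
})
table = pd.pivot_table(df, index="B", columns="A", values="Count",
                       aggfunc="sum", fill_value=0, margins=True)
labeled = label_pivot(table, row_name="B", col_name="A")
print(labeled)
```

If margins_name is changed in the pd.pivot_table call, the same value would have to be passed to the helper so that it can find the totals row and column.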

Answer 1: accepted, 2015-07-11 14:01:12
