
How to make a pivot table in pandas behave like a pivot table in Excel?

I'm trying to transpose the data. The aggregation method doesn't matter, but the resulting columns are grouped by value ('A', 'B', 'C') instead of by date.

Code:

import pandas as pd
d = {'date': ['2/21/2020', '2/21/2020','2/22/2020','2/22/2020','2/23/2020','2/23/2020'], 
     'name': ['James','John', 'James','John','James','John'],
     'A':[1,2,3,4,5,6],
     'B':[7,8,9,10,11,12],
     'C':[13,14,15,16,17,18]}
df = pd.DataFrame(data=d)
df = pd.pivot_table(df, index='name', columns='date', values=['A','B','C'])
df

Output I get:

[image: pandas DataFrame output]

What I need:

[image: Excel pivot table]

Note: in Excel, the pivot table was set up with 'date' as Columns, 'name' as Rows, and 'A', 'B' and 'C' as Values.

You'll need to use swaplevel to switch the order of the column MultiIndex so that date is on top and "A", "B", "C" are on the bottom. Then sort that index as well, so the columns for each date end up next to each other. To replace "A" with "Sum of A", I used the rename method to prefix the columns with "Sum of ".

new_df = (df.pivot_table(index ='name', columns='date', values=['A','B','C'])
          .swaplevel(axis=1)
          .sort_index(axis=1)
          .rename(columns="Sum of {}".format, level=1)
)

print(new_df)
date  2/21/2020                   2/22/2020                   2/23/2020                  
       Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C
name                                                                                     
James         1        7       13         3        9       15         5       11       17
John          2        8       14         4       10       16         6       12       18
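
To see what each step in the chain does to the column MultiIndex, the intermediate results can be inspected one at a time. This is only a minimal sketch built on the sample data from the question; the names pivoted, swapped and ordered are illustrative, and aggfunc="sum" is an assumption chosen to match the "Sum of" labels (the answer above relies on the default aggregation).

```python
import pandas as pd

d = {'date': ['2/21/2020', '2/21/2020', '2/22/2020', '2/22/2020', '2/23/2020', '2/23/2020'],
     'name': ['James', 'John', 'James', 'John', 'James', 'John'],
     'A': [1, 2, 3, 4, 5, 6],
     'B': [7, 8, 9, 10, 11, 12],
     'C': [13, 14, 15, 16, 17, 18]}
df = pd.DataFrame(data=d)

# Step 1: pivot. The value names (A, B, C) form the top column level
# and the dates form the second level.
pivoted = df.pivot_table(index='name', columns='date',
                         values=['A', 'B', 'C'], aggfunc='sum')
print(list(pivoted.columns.names))

# Step 2: swap the two column levels so that the dates are on top.
swapped = pivoted.swaplevel(axis=1)
print(list(swapped.columns.names))

# Step 3: sort the columns so that all values for one date sit together.
ordered = swapped.sort_index(axis=1)
print(ordered.columns.tolist()[:3])
```

Note that the dates here are strings, so sort_index orders them as text. That happens to be chronological for this sample, but it would not be in general (for example across months 2 and 10) unless the column is converted to datetime first.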

To get similar output including totals, we can use margins and swaplevel. After that, we can rename the columns with mapper. Finally, .iloc[:, :-3] removes the three trailing "Grand Total" columns (the per-row totals); leave it out if you want to keep them:

# pivot_table (not pivot) is the method that accepts margins and aggfunc
df1 = (df.pivot_table(index='name', columns='date', values=['A','B','C'],
                      margins=True, margins_name='Grand Total', aggfunc='sum')
      .swaplevel(axis=1)
      .sort_index(axis=1)
      .rename(mapper=lambda x: f'Sum of {x}', axis=1, level=1)
      .iloc[:, :-3])


print(df1)

Output:

date        2/21/2020                   2/22/2020                   2/23/2020                  
             Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C
name                                                                                           
James               1        7       13         3        9       15         5       11       17
John                2        8       14         4       10       16         6       12       18
Grand Total         3       15       27         7       19       31        11       23       35
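
As an alternative to the positional .iloc[:, :-3], the "Grand Total" column group can be dropped by label, which does not depend on how many value columns there are. The following is a minimal sketch under the same sample data; the names with_totals and without_row_totals are illustrative and not part of the answer above.

```python
import pandas as pd

d = {'date': ['2/21/2020', '2/21/2020', '2/22/2020', '2/22/2020', '2/23/2020', '2/23/2020'],
     'name': ['James', 'John', 'James', 'John', 'James', 'John'],
     'A': [1, 2, 3, 4, 5, 6],
     'B': [7, 8, 9, 10, 11, 12],
     'C': [13, 14, 15, 16, 17, 18]}
df = pd.DataFrame(data=d)

# Pivot with totals, then put the dates on top and prefix the value names.
with_totals = (df.pivot_table(index='name', columns='date', values=['A', 'B', 'C'],
                              aggfunc='sum', margins=True, margins_name='Grand Total')
               .swaplevel(axis=1)
               .sort_index(axis=1)
               .rename(columns='Sum of {}'.format, level=1))

# Drop every column whose top level is 'Grand Total' (the per-row totals),
# keeping the 'Grand Total' row at the bottom.
without_row_totals = with_totals.drop(columns='Grand Total', level=0)
print(without_row_totals)
```

Keeping with_totals instead gives both the bottom "Grand Total" row and the per-row totals, which is closer to Excel's default layout with grand totals switched on for rows and columns.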
