
How to make a pivot table in pandas behave like a pivot table in Excel?

I'm trying to transpose the data. The aggregation method doesn't matter, but the result is grouped by the value columns (A, B, C) instead of by date.

Code:

import pandas as pd
d = {'date': ['2/21/2020', '2/21/2020','2/22/2020','2/22/2020','2/23/2020','2/23/2020'], 
     'name': ['James','John', 'James','John','James','John'],
     'A':[1,2,3,4,5,6],
     'B':[7,8,9,10,11,12],
     'C':[13,14,15,16,17,18]}
df = pd.DataFrame(data=d)
df = pd.pivot_table(df, index='name', columns='date', values=['A','B','C'])
df

Output I get:

[image: pandas DataFrame]

What I need

[image: Excel pivot table]

Note: in Excel, the pivot table was set up with 'date' as Columns, 'name' as Rows, and 'A', 'B' and 'C' as Values.

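You'll need to use swaplevel to switch the order of the column MultiIndex levels so that date is on top and "A", "B", "C" are on the bottom. Then you'll sort the columns with sort_index as well. To replace "A" with "Sum of A", I used the rename method to prefix the level-1 column labels with "Sum of ". Note that pivot_table aggregates with the mean by default; here each name/date pair has a single row, so the mean and the sum coincide.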

new_df = (df.pivot_table(index ='name', columns='date', values=['A','B','C'])
          .swaplevel(axis=1)
          .sort_index(axis=1)
          .rename(columns="Sum of {}".format, level=1)
)

print(new_df)
date  2/21/2020                   2/22/2020                   2/23/2020                  
       Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C
name                                                                                     
James         1        7       13         3        9       15         5       11       17
John          2        8       14         4       10       16         6       12       18
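To see why the swap and the sort are both needed, it can help to inspect the column MultiIndex after each step. The following is only a sketch using the question's sample data; the intermediate names (wide, swapped, ordered) are made up for illustration, and aggfunc="sum" is passed explicitly so the "Sum of" label stays accurate if a name/date pair ever has more than one row.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": ["2/21/2020", "2/21/2020", "2/22/2020", "2/22/2020", "2/23/2020", "2/23/2020"],
    "name": ["James", "John", "James", "John", "James", "John"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [7, 8, 9, 10, 11, 12],
    "C": [13, 14, 15, 16, 17, 18],
})

# Step 1: pivot. The value names (A, B, C) land on the top column level
# and the dates on the level below.
wide = df.pivot_table(index="name", columns="date", values=["A", "B", "C"], aggfunc="sum")
print(list(wide.columns.names))  # [None, 'date']

# Step 2: swap the two column levels so the dates are on top.
# The column order is still A/A/A, B/B/B, C/C/C at this point.
swapped = wide.swaplevel(axis=1)

# Step 3: sort the columns so that A, B, C are grouped under each date.
ordered = swapped.sort_index(axis=1)
print(list(ordered.columns[:3]))
```

One caveat: the dates here are strings, so sort_index orders them lexicographically. That works for this sample, but a wider range of dates (for example '10/1/2020' next to '2/21/2020') would likely need converting with pd.to_datetime before pivoting.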

To get similar output with a totals row, we can use pivot_table with margins=True, followed by swaplevel and sort_index. After that, we can rename the columns with a mapper function. Finally, .iloc[:, :-3] drops the three "Grand Total" columns (the per-row totals that margins=True also adds); leave it out if you want to keep them:

# pivot_table (not pivot) is the method that accepts margins and aggfunc
df1 = (df.pivot_table(index='name', columns='date', margins=True, margins_name='Grand Total', aggfunc='sum')
       .swaplevel(axis=1)
       .sort_index(axis=1)
       .rename(mapper=lambda x: f'Sum of {x}', axis=1, level=1)
       .iloc[:, :-3])


print(df1)

output:

date        2/21/2020                   2/22/2020                   2/23/2020                  
             Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C  Sum of A Sum of B Sum of C
name                                                                                           
James               1        7       13         3        9       15         5       11       17
John                2        8       14         4       10       16         6       12       18
Grand Total         3       15       27         7       19       31        11       23       35
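Slicing off the last three columns by position relies on "Grand Total" sorting after every date string and on there being exactly three value columns. As a hedged alternative, the sketch below drops the total columns by label instead; the names totals and no_row_totals are hypothetical, and the data is the question's sample.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "date": ["2/21/2020", "2/21/2020", "2/22/2020", "2/22/2020", "2/23/2020", "2/23/2020"],
    "name": ["James", "John", "James", "John", "James", "John"],
    "A": [1, 2, 3, 4, 5, 6],
    "B": [7, 8, 9, 10, 11, 12],
    "C": [13, 14, 15, 16, 17, 18],
})

# margins=True adds a "Grand Total" row and, per value, a "Grand Total" column.
totals = (
    df.pivot_table(index="name", columns="date", values=["A", "B", "C"],
                   aggfunc="sum", margins=True, margins_name="Grand Total")
    .swaplevel(axis=1)   # dates (and "Grand Total") move to the top level
    .sort_index(axis=1)  # group A, B, C under each top-level label
)

# Drop the per-row total columns by their top-level label, not by position.
no_row_totals = totals.drop(columns="Grand Total", level=0)

# Same prefixing step as in the answer above.
result = no_row_totals.rename(columns="Sum of {}".format, level=1)
print(result)
```

Dropping by label keeps working if more value columns are added later, whereas a positional slice like .iloc[:, :-3] would need its offset updated.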
