python pandas: group by with sum and mean on different axes

Question

I need to group my data and calculate the mean along one axis and the sum along another. I've been looking for similar questions but can't find a proper solution.

I have a df similar to this:

import numpy as np
import pandas as pd

df = pd.DataFrame ({'A': ['XX','XX','XX','XX','XX','XX','XX','XX','XX',
                          'YY','YY','YY','YY','YY','YY','YY','YY','YY',
                          'ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ'],
                    
                    'B': ['ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3',
                          'ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3',
                          'ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3'],   
                                        
                    'C': ['2017','2017','2017','2018','2018','2018','2019','2019','2019',
                          '2017','2017','2017','2018','2018','2018','2019','2019','2019',
                          '2017','2017','2017','2018','2018','2018','2019','2019','2019'],
                    
                    'D': np.random.randint(0,100,size=27)})

I need the following df:

A   ind1    ind2    ind3    TOTAL
XX  52.33   73.00   37.00   162.33
YY  40.67   51.33   54.33   146.33
ZZ  84.00   28.67   62.00   174.67

Here the columns ind1, ind2, ind3 hold the mean of D for each combination of A and B (aggregated down the rows, axis=0), while TOTAL is the row-wise sum of ind1, ind2, ind3 (axis=1).

I tried the following, but it does not work:

print(df.groupby('A')['D'].agg(['sum','mean']))

Any help would be fantastic.

Answer 1

I believe you need to pivot with crosstab or DataFrame.pivot_table, and then add a new column of row sums with DataFrame.assign:

import numpy as np
import pandas as pd

# Fix the seed so the random values in D are reproducible
np.random.seed(20)
    
df = pd.DataFrame ({'A': ['XX','XX','XX','XX','XX','XX','XX','XX','XX',
                          'YY','YY','YY','YY','YY','YY','YY','YY','YY',
                          'ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ','ZZ'],
                    
                    'B': ['ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3',
                          'ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3',
                          'ind1','ind2','ind3','ind1','ind2','ind3','ind1','ind2','ind3'],   
                                        
                    'C': ['2017','2017','2017','2018','2018','2018','2019','2019','2019',
                          '2017','2017','2017','2018','2018','2018','2019','2019','2019',
                          '2017','2017','2017','2018','2018','2018','2019','2019','2019'],
                    
                    'D': np.random.randint(0,100,size=27)})

df = (pd.crosstab(df['A'], df['B'], df['D'], aggfunc='mean')
        .assign(Total = lambda x: x.sum(axis=1)))

print(df)
B        ind1       ind2       ind3       Total
A                                              
XX  67.666667  46.000000  60.000000  173.666667
YY  69.333333  45.666667  67.333333  182.333333
ZZ  16.333333  57.666667  32.333333  106.333333

Or, starting from the original df (pivot_table aggregates with the mean by default):

df = (df.pivot_table(index='A',columns='B',values='D')
        .assign(Total = lambda x: x.sum(axis=1)))
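To make the equivalence of the two variants above easier to check by hand, here is a minimal sketch on a small fixed frame. The name toy and its values are made up for illustration and are not part of the question's data; aggfunc='mean' is spelled out in pivot_table only for readability.

```python
import pandas as pd

# Hypothetical toy data (not from the question): two groups, two indicators
toy = pd.DataFrame({
    'A': ['XX', 'XX', 'XX', 'XX', 'YY', 'YY', 'YY', 'YY'],
    'B': ['ind1', 'ind2', 'ind1', 'ind2', 'ind1', 'ind2', 'ind1', 'ind2'],
    'D': [10, 20, 30, 40, 1, 2, 3, 4],
})

# Mean of D per (A, B) cell, then a row-wise sum of those means
via_crosstab = (pd.crosstab(toy['A'], toy['B'], toy['D'], aggfunc='mean')
                  .assign(Total=lambda x: x.sum(axis=1)))

via_pivot = (toy.pivot_table(index='A', columns='B', values='D', aggfunc='mean')
                .assign(Total=lambda x: x.sum(axis=1)))

print(via_pivot)
```

For XX the means are 20 (ind1) and 30 (ind2), so Total is 50; for YY they are 2 and 3, so Total is 5. Both variants should produce the same table.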

Answer 2

This is another method if you are not familiar with crosstab or pivot_table.

df_n = df.groupby(['A','B'])['D'].mean().unstack()
df_n['Total'] = df_n.sum(axis=1)

The output would be:

B        ind1       ind2       ind3       Total
A                                              
XX  67.666667  46.000000  60.000000  173.666667
YY  69.333333  45.666667  67.333333  182.333333
ZZ  16.333333  57.666667  32.333333  106.333333
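One thing worth knowing about the groupby/unstack route: if some combination of A and B never occurs, unstack leaves NaN in that cell, and sum(axis=1) skips NaN by default. The sketch below illustrates this on made-up data (the frame toy and its values are assumptions, not taken from the question).

```python
import pandas as pd

# Hypothetical toy data: group YY has no rows for ind2
toy = pd.DataFrame({
    'A': ['XX', 'XX', 'XX', 'YY', 'YY'],
    'B': ['ind1', 'ind1', 'ind2', 'ind1', 'ind1'],
    'D': [10, 30, 20, 1, 3],
})

# Mean of D per (A, B), reshaped so that the B values become columns
wide = toy.groupby(['A', 'B'])['D'].mean().unstack()

# Row-wise sum of the means; missing cells (NaN) are ignored by default
wide['Total'] = wide.sum(axis=1)

print(wide)
```

Here the YY/ind2 cell is NaN, and the Total for YY is just the ind1 mean (2.0). If a missing cell should instead make the total NaN, passing skipna=False to sum is one option.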

Answer 1
Score 1, accepted, 2020-07-22 06:47:15

Answer 2
Score 0, 2020-07-22 10:04:30
