How to select multiple columns in an aggregate function?

Question

I have data like this:

A,B,C,D
1,50,1,3.9
2,20,22,1.5
3,10,10,2.3
2,15,11,1.8
1,16,13,4.2

I want to group the rows by A, taking the mean of B and C and the sum of D.
My current solution looks like this:

df = df.groupby(['A']).agg({
    'B': 'mean', 'C': 'mean', 'D': sum
})

Is there a way to apply the same function to multiple columns at once, rather than repeating it for each column as I do for B and C?

Answer 1

If you need at most one aggregation per column, you can store the aggregations in a dict of the form {func: col_list}, then invert it into a {col: func} dict when you aggregate.

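# Map each aggregation to the list of columns it applies to.
# The string 'sum' selects the same groupby sum as the builtin.
d = {'mean': ['B', 'C'], 'sum': ['D']}

# Invert to {column: func}, the form that agg() accepts.
df.groupby(['A']).agg({col: f for f, cols in d.items() for col in cols})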
#      B     C    D
#A                 
#1  33.0   7.0  8.1
#2  17.5  16.5  3.3
#3  10.0  10.0  2.3
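
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, assuming the question's data is loaded from an in-memory CSV. The helper name `invert_spec` is hypothetical (it is not part of pandas), and the `dict.fromkeys` variant is just one possible alternative for the case where a separate mapping feels like overkill.

```python
import io

import pandas as pd

# Sample data from the question, loaded from an in-memory CSV.
csv_text = """A,B,C,D
1,50,1,3.9
2,20,22,1.5
3,10,10,2.3
2,15,11,1.8
1,16,13,4.2
"""
df = pd.read_csv(io.StringIO(csv_text))


def invert_spec(func_to_cols):
    """Hypothetical helper: turn {func: [columns]} into {column: func}."""
    return {col: func for func, cols in func_to_cols.items() for col in cols}


# Same approach as the answer, using string function names throughout.
spec = invert_spec({"mean": ["B", "C"], "sum": ["D"]})
result = df.groupby("A").agg(spec)

# Alternative without a helper: dict.fromkeys assigns one function
# to several columns, and the remaining columns are listed explicitly.
spec_alt = {**dict.fromkeys(["B", "C"], "mean"), "D": "sum"}
result_alt = df.groupby("A").agg(spec_alt)

print(result)
```

Both variants build the same {column: func} dict, so they only work when each column gets at most one aggregation, as noted above.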

Answer 1 · 0 · Accepted · 2019-08-08 19:44:19
