How to choose multiple columns in aggregate functions?
I have data like this:
A,B,C,D
1,50,1,3.9
2,20,22,1.5
3,10,10,2.3
2,15,11,1.8
1,16,13,4.2
I want to group the rows by A, taking the mean of B and C and the sum of D.
The solution would be like this:
df = df.groupby(['A']).agg({
'B': 'mean', 'C': 'mean', 'D': sum
})
Is there a way to choose multiple columns for the same function, rather than repeating it as in the case of B and C?
If you require at most one aggregation per column, you can store the aggregations in a dict {func: col_list}, then invert it into a {col: func} mapping when you aggregate.
d = {'mean': ['B', 'C'], sum: ['D']}
df.groupby(['A']).agg({col: f for f,cols in d.items() for col in cols})
#       B     C    D
# A
# 1  33.0   7.0  8.1
# 2  17.5  16.5  3.3
# 3  10.0  10.0  2.3
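For reference, here is a self-contained sketch of the same idea that can be run as-is. It rebuilds the sample data from the question and wraps the dict inversion in a small helper. The helper name invert_agg_spec is hypothetical (it is not part of pandas), and the sketch uses the string 'sum' instead of the built-in sum, on the assumption that string names are the safer choice across pandas versions.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "A": [1, 2, 3, 2, 1],
    "B": [50, 20, 10, 15, 16],
    "C": [1, 22, 10, 11, 13],
    "D": [3.9, 1.5, 2.3, 1.8, 4.2],
})


def invert_agg_spec(spec):
    """Turn {func: [cols]} into the {col: func} mapping that agg() accepts.

    Hypothetical helper for illustration; it enforces the
    "at most one aggregation per column" restriction explicitly.
    """
    inverted = {}
    for func, cols in spec.items():
        for col in cols:
            if col in inverted:
                raise ValueError(f"column {col!r} has more than one aggregation")
            inverted[col] = func
    return inverted


spec = {"mean": ["B", "C"], "sum": ["D"]}
result = df.groupby("A").agg(invert_agg_spec(spec))
print(result)
```

The explicit check makes the limitation visible: if a column appears under two functions, the inverted dict could only keep one of them, so the helper raises instead of silently dropping an aggregation.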