How to apply different aggregation functions to the same column using pandas groupby
It is clear what happens when doing
data.groupby(['A','B']).mean()
We get a result with a MultiIndex on levels 'A' and 'B' and one column holding the mean of each group.
How could I get count() and std() at the same time?
The result should be a dataframe that looks like this:
A B mean count std
The following should work:
data.groupby(['A','B']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Basically, call agg: passing a list of functions generates multiple columns, one for each function applied.
Example:
In [12]:
import numpy as np
import pandas as pd

df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Out[12]:
          a
       mean       std  count
b
0 -0.769198  0.158049      2
1  0.247708  0.743606      2
2 -0.312705       NaN      1
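The NaN in the std column for group 2 is expected: that group has a single row, and pandas computes the sample standard deviation (ddof=1) by default, which is undefined for one observation. A minimal sketch of that behaviour, using a made-up one-element Series to stand in for the group:

```python
import numpy as np
import pandas as pd

# A made-up one-row group, standing in for b == 2 in the example above
single = pd.Series([5.0])

# Default ddof=1 divides by n - 1 = 0, so the result is NaN
sample_std = single.std()

# ddof=0 (population standard deviation) divides by n = 1 instead
population_std = single.std(ddof=0)

print(sample_std, population_std)
```

If a population standard deviation is what you want, something like `lambda s: s.std(ddof=0)` can be placed in the list passed to agg, though the resulting column will then be named after the lambda.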
You can also pass the method names as strings. The common ones work; some of the more obscure ones don't (I can't remember which), but in this case they work fine. Thanks to @ajcr for the suggestion:
In [16]:
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg(['mean', 'std', 'count'])
Out[16]:
          a
       mean       std  count
b
0 -1.037301  0.790498      2
1 -0.495549  0.748858      2
2 -0.644818       NaN      1
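To get closer to the flat "A B mean count std" layout asked for in the question, one option is to select the column before aggregating and then call reset_index. This is only a sketch: the frame below and its column name 'value' are made up for illustration and are not from the original post.

```python
import pandas as pd

# Made-up data; 'value' is a hypothetical column name
data = pd.DataFrame({
    'A': ['x', 'x', 'x', 'y', 'y'],
    'B': [1, 1, 2, 1, 1],
    'value': [1.0, 3.0, 5.0, 2.0, 6.0],
})

# Selecting a single column first gives single-level column names
# ('mean', 'count', 'std') instead of a two-level column header.
# reset_index() then turns the group keys A and B into ordinary columns.
flat = (
    data.groupby(['A', 'B'])['value']
        .agg(['mean', 'count', 'std'])
        .reset_index()
)

print(flat)
```

Without the `['value']` selection, every non-key column gets its own mean/std/count block, as in the outputs above.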