How to apply different aggregation functions to the same column using pandas GroupBy

Question

It is clear what happens when doing

 data.groupby(['A','B']).mean()

We get a result indexed by a MultiIndex with levels 'A' and 'B', and one column with the mean of each group.

How could I get count() and std() at the same time?

so that the result looks like this in a DataFrame:

A   B    mean   count   std

Answer 1

The following should work:

data.groupby(['A','B']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])

Basically, calling agg and passing a list of functions generates multiple columns, one for each function applied.

Example:

In [12]:

df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg([pd.Series.mean, pd.Series.std, pd.Series.count])
Out[12]:
          a                
       mean       std count
b                          
0 -0.769198  0.158049     2
1  0.247708  0.743606     2
2 -0.312705       NaN     1

You can also pass the method names as strings. The common ones work; some of the more obscure ones don't (I can't remember which), but in this case they work fine. Thanks to @ajcr for the suggestion:

In [16]:
df = pd.DataFrame({'a':np.random.randn(5), 'b':[0,0,1,1,2]})
df.groupby(['b']).agg(['mean', 'std', 'count'])

Out[16]:
          a                
       mean       std count
b                          
0 -1.037301  0.790498     2
1 -0.495549  0.748858     2
2 -0.644818       NaN     1
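
The outputs above carry a two-level column header (a, then mean/std/count), whereas the question asks for a flat A, B, mean, count, std layout. A minimal sketch of one way to get there, assuming a made-up DataFrame whose value column is named 'value' (that name and the sample numbers are hypothetical, not from the question), and assuming pandas 0.25 or later for the keyword form:

```python
import pandas as pd

# Hypothetical sample data: 'A' and 'B' follow the question,
# 'value' is an invented column name used only for illustration.
data = pd.DataFrame({
    'A': ['x', 'x', 'x', 'y', 'y'],
    'B': [1, 1, 2, 1, 1],
    'value': [1.0, 3.0, 5.0, 2.0, 4.0],
})

# Selecting a single column before agg gives single-level result columns;
# reset_index turns the 'A'/'B' index levels back into ordinary columns.
flat = (
    data.groupby(['A', 'B'])['value']
        .agg(['mean', 'count', 'std'])
        .reset_index()
)

# Keyword ("named") aggregation: each keyword is an output column name,
# mapped to an (input column, function name) pair.
named = (
    data.groupby(['A', 'B'])
        .agg(mean=('value', 'mean'),
             count=('value', 'count'),
             std=('value', 'std'))
        .reset_index()
)

print(flat)
print(named)
```

Both results should have the columns A, B, mean, count, std. The keyword form is handy when several input columns are aggregated and each output needs its own name.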

Answer 1 (3, accepted), 2015-06-05 19:56:11
