Pandas groupby cumulative/rolling sum, average, and std

Question

I have a dataframe (df) that looks like the one below:

month-year    name    a    b    c
2018-01       X       2    1    4
2018-01       Y       1    0    5
2018-01       X       1    6    3
2018-01       Y       4    10   7
2018-02       X       13   4    2
2018-02       Y       22   13   9
2018-02       X       3    7    4
2018-02       Y       2    15   0

I want to group by month-year and name to get the sum of column a, the average of column b, and the std of column c. However, I want the sum, average, and std to be rolling/cumulative numbers.

For example, for this dataset, to get the output I want for a, I can do something like

df.groupby(['month-year', 'name'])[['a']].sum().groupby(level='name').cumsum()

to get something like

month-year    name    a
2018-01       X       3
              Y       5
2018-02       X       19
              Y       29

What can I do to find the cumulative average of b and the cumulative std of c, to get an output that looks like this?

month-year    name    a    b    c
2018-01       X       3    3.5  0.71
              Y       5    5    1.41
2018-02       X       19   4.5  0.96
              Y       29   9.5  3.86
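
To make the target concrete, here is a small check of the X rows using only the standard library. It is a sketch, assuming "cumulative" means every row for that name up to and including the given month, and that std is the sample standard deviation (the pandas default, ddof=1). The list names are made up for illustration.

```python
from statistics import mean, stdev

# Rows for name == 'X', in table order, copied from the data above.
x_a = [2, 1, 13, 3]   # first two rows are 2018-01, last two are 2018-02
x_b = [1, 6, 4, 7]
x_c = [4, 3, 2, 4]

# Through 2018-01 only the first two rows count; through 2018-02 all four do.
through_jan = (sum(x_a[:2]), mean(x_b[:2]), round(stdev(x_c[:2]), 2))
through_feb = (sum(x_a), mean(x_b), round(stdev(x_c), 2))

print(through_jan)
print(through_feb)
```

These tuples correspond to the X rows of the desired output table.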

Thank you.

Answer 1

You can do this with expanding.

The first step is to calculate the expanding sum, mean, and std for each of your columns, grouping only by 'name', and join that back to the original DataFrame.

Then group by ['month-year', 'name'] and select the last row within each group. This relies on the rows already being in chronological order within each name.

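# Expanding (cumulative) statistics per name, aligned back to the original rows
df = df.join(df.groupby(['name']).expanding().agg({'a': 'sum', 'b': 'mean', 'c': 'std'})
               .reset_index(level=0, drop=True)
               .add_suffix('_roll'))

# Last cumulative value in each (month-year, name) group
df.groupby(['month-year', 'name']).last().drop(columns=['a', 'b', 'c'])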

Output:

                 a_roll  b_roll    c_roll
month-year name                          
2018-01    X        3.0     3.5  0.707107
           Y        5.0     5.0  1.414214
2018-02    X       19.0     4.5  0.957427
           Y       29.0     9.5  3.862210
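
For anyone who wants to run this end to end, here is a self-contained sketch. It rebuilds the sample data from the question and computes each statistic separately (cumsum for a, expanding mean and std for b and c) rather than through a single agg call, which is a variation on the code above, not the same code. The variable names `g`, `stats`, and `result` are made up for illustration, and the rows are assumed to be in chronological order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'month-year': ['2018-01'] * 4 + ['2018-02'] * 4,
    'name': ['X', 'Y', 'X', 'Y'] * 2,
    'a': [2, 1, 1, 4, 13, 22, 3, 2],
    'b': [1, 0, 6, 10, 4, 13, 7, 15],
    'c': [4, 5, 3, 7, 2, 9, 4, 0],
})

g = df.groupby('name')

# Cumulative statistics within each name, in row order.
# reset_index drops the 'name' level so the results align with df's row index.
stats = df[['month-year', 'name']].assign(
    a_roll=g['a'].cumsum(),
    b_roll=g['b'].expanding().mean().reset_index(level=0, drop=True),
    c_roll=g['c'].expanding().std().reset_index(level=0, drop=True),
)

# Keep the last cumulative value in each (month-year, name) group.
result = stats.groupby(['month-year', 'name']).last()
print(result.round(2))
```

If the rows are not already ordered by date, sorting by 'month-year' before grouping would be needed for the cumulative values to make sense.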

Answer 1 posted 2018-08-08 16:17:43
