
Pandas groupby cumulative/rolling sum, average, and std

I have a dataframe (df) that is like the one below:

month-year    name    a    b    c
2018-01       X       2    1    4
2018-01       Y       1    0    5
2018-01       X       1    6    3
2018-01       Y       4    10   7
2018-02       X       13   4    2
2018-02       Y       22   13   9
2018-02       X       3    7    4
2018-02       Y       2    15   0

I want to group by month-year and name to get the sum of column a, the average of column b, and the std of column c. However, I want the sum, average, and std to be rolling/cumulative: each month's value should include all earlier months for the same name.

For example, for this dataset, to find the output I want for a, I can do something like

df.groupby(['month-year', 'name'])[['a']].sum().groupby(level='name').cumsum()

to get something like

month-year    name    a
2018-01       X       3
              Y       5
2018-02       X       19
              Y       29

What can I do to find the cumulative average of b and std of c to get an output that looks like this?

month-year    name    a    b    c
2018-01       X       3    3.5  0.71
              Y       5    5    1.41
2018-02       X       19   4.5  0.96
              Y       29   9.5  3.86

Thank you.

You can do this with expanding().
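As a quick illustration of what an expanding window computes, here is a minimal sketch on a single column. It uses the four values of column c for name X from the question; the variable names are only for illustration.

```python
import pandas as pd

# Column c for name X from the question, in row order.
c_x = pd.Series([4, 3, 2, 4])

# expanding() grows the window from the first row up to the current row,
# so each result is a cumulative statistic over everything seen so far.
running_std = c_x.expanding().std()    # sample std (ddof=1); the first value is NaN
running_mean = c_x.expanding().mean()

print(running_std)
print(running_mean)
```

The second and fourth values of running_std are the two numbers that appear for X in the desired output (about 0.71 and 0.96).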

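The first step is to calculate the expanding sum, mean and std for each of your columns, grouping only by 'name', and then join the result back to the original DataFrame.

Then group by ['month-year', 'name'] and select the last row within each group. Because the expanding values accumulate in row order, the last row of each group holds the cumulative value through the end of that month.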

# Expanding (cumulative) statistics per name, aligned back to the original rows
df = df.join(df.groupby(['name']).expanding().agg({'a': 'sum', 'b': 'mean', 'c': 'std'})
               .reset_index(level=0, drop=True)
               .add_suffix('_roll'))

df.groupby(['month-year', 'name']).last().drop(columns=['a', 'b', 'c'])

Output:

                 a_roll  b_roll    c_roll
month-year name                          
2018-01    X        3.0     3.5  0.707107
           Y        5.0     5.0  1.414214
2018-02    X       19.0     4.5  0.957427
           Y       29.0     9.5  3.862210
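For anyone who wants to reproduce this end to end, here is a self-contained sketch that rebuilds the sample frame from the question. It is a variant of the code above rather than the same call chain: each statistic is computed separately (cumsum for a, expanding mean for b, expanding std for c) instead of through a single agg dict. The function name cumulative_stats is made up for this example, and the sketch assumes the rows are already in chronological order.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "month-year": ["2018-01"] * 4 + ["2018-02"] * 4,
    "name": ["X", "Y", "X", "Y"] * 2,
    "a": [2, 1, 1, 4, 13, 22, 3, 2],
    "b": [1, 0, 6, 10, 4, 13, 7, 15],
    "c": [4, 5, 3, 7, 2, 9, 4, 0],
})


def cumulative_stats(frame):
    """Cumulative sum of a, mean of b and std of c per name, reported per month.

    Hypothetical helper for illustration; assumes rows are in time order.
    """
    by_name = frame.groupby("name")
    rolled = pd.DataFrame({
        # Running total of a within each name.
        "a_roll": by_name["a"].cumsum(),
        # Expanding results carry a (name, row) MultiIndex; drop the name
        # level so the values align with the original row index.
        "b_roll": by_name["b"].expanding().mean().reset_index(level=0, drop=True),
        "c_roll": by_name["c"].expanding().std().reset_index(level=0, drop=True),
    })
    joined = frame[["month-year", "name"]].join(rolled)
    # The last row of each (month, name) group holds the value accumulated
    # through the end of that month.
    return joined.groupby(["month-year", "name"]).last()


result = cumulative_stats(df)
print(result.round(2))
```

Here a_roll stays an integer column because cumsum keeps the integer dtype, whereas the expanding version returns floats; the values are otherwise the same as in the output above.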
