How to save rolling window aggregation results back into the original DataFrame?

I have the following DataFrame:

    UserID  Amount      Timestamp
50  1       600.00      2021-05-23 10:00:00
53  1       723.00      2021-05-24 05:12:00
54  2       1.00        2021-05-25 00:24:00
55  2       1000.00     2021-05-25 19:36:00
56  2       10000.00    2021-05-26 14:48:00
58  3       30.00       2021-05-27 10:00:00
60  4       50.00       2021-05-28 05:12:00
64  4       500.00      2021-05-29 00:24:00
65  4       10.00       2021-05-29 19:36:00
66  4       235.52      2021-05-30 14:48:00
69  4       567.12      2021-05-31 10:00:00

And I compute the aggregates like this:

agg = df.groupby(['UserID']).rolling('15d', on='Timestamp')['Amount'].agg(['sum', 'mean', 'std'])

The result cannot be assigned straight back to the original DataFrame. I tried df[['a', 'b', 'c']] = agg.values, but then the values end up on the wrong rows. What is the correct way to save rolling window aggregation results back into the original DataFrame?

First compute the aggregates as you already do, then call reset_index() on the result so that UserID and Timestamp become ordinary columns again instead of index levels.
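To see why the positional assignment with agg.values can misplace values, here is a minimal sketch on made-up toy data (not the question's data) in which the rows of two users are interleaved. It assumes the real DataFrame is not already sorted by UserID, which the question does not confirm.

```python
import pandas as pd

# Hypothetical toy data: rows of the two users are interleaved on purpose.
df = pd.DataFrame(
    {
        "UserID": [2, 1, 2],
        "Amount": [10.0, 600.0, 30.0],
        "Timestamp": pd.to_datetime(
            ["2021-05-23 10:00", "2021-05-24 05:12", "2021-05-25 00:24"]
        ),
    }
)

agg = (
    df.groupby("UserID")
    .rolling("15d", on="Timestamp")["Amount"]
    .agg(["sum", "mean"])
)

# The result is grouped by UserID, so user 1's row comes first here,
# although it is the second row of df. Row positions no longer match.
print(agg.index.names)
print(agg)

# reset_index() turns the index levels into columns that can serve as join keys.
flat = agg.reset_index()
print(flat.columns.tolist())
```

Because the rows come back grouped by key rather than in the original order, matching on UserID and Timestamp is safer than copying values by position.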

Then use pd.merge to join this result with the original DataFrame on UserID and Timestamp, which attaches sum, mean and std to the matching rows next to the original Amount column:

>>> df2 = df.groupby(['UserID']).rolling('15d', on='Timestamp')['Amount'].agg(['sum', 'mean', 'std']).reset_index()
>>> df = pd.merge(df, df2, on=['UserID','Timestamp'])
>>> df
    UserID    Amount           Timestamp       sum         mean          std
0        1    600.00 2021-05-23 10:00:00    600.00   600.000000          NaN
1        1    723.00 2021-05-24 05:12:00   1323.00   661.500000    86.974134
2        2      1.00 2021-05-25 00:24:00      1.00     1.000000          NaN
3        2   1000.00 2021-05-25 19:36:00   1001.00   500.500000   706.399674
4        2  10000.00 2021-05-26 14:48:00  11001.00  3667.000000  5507.237692
5        3     30.00 2021-05-27 10:00:00     30.00    30.000000          NaN
6        4     50.00 2021-05-28 05:12:00     50.00    50.000000          NaN
7        4    500.00 2021-05-29 00:24:00    550.00   275.000000   318.198052
8        4     10.00 2021-05-29 19:36:00    560.00   186.666667   272.090671
9        4    235.52 2021-05-30 14:48:00    795.52   198.880000   223.499928
10       4    567.12 2021-05-31 10:00:00   1362.64   272.528000   254.134419
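One caveat: the merge only maps one row to one row if every (UserID, Timestamp) pair is unique. The sketch below rebuilds the first five rows of the question's data and adds validate="one_to_one" as a guard. The names stats and merged are just illustrative, and whether the real data has duplicate timestamps per user is an assumption you would need to check.

```python
import pandas as pd

# First five rows of the question's data (users 1 and 2 only).
df = pd.DataFrame(
    {
        "UserID": [1, 1, 2, 2, 2],
        "Amount": [600.0, 723.0, 1.0, 1000.0, 10000.0],
        "Timestamp": pd.to_datetime(
            [
                "2021-05-23 10:00:00",
                "2021-05-24 05:12:00",
                "2021-05-25 00:24:00",
                "2021-05-25 19:36:00",
                "2021-05-26 14:48:00",
            ]
        ),
    },
    index=[50, 53, 54, 55, 56],
)

stats = (
    df.groupby("UserID")
    .rolling("15d", on="Timestamp")["Amount"]
    .agg(["sum", "mean", "std"])
    .reset_index()
)

# validate="one_to_one" makes merge raise a MergeError if a
# (UserID, Timestamp) pair is duplicated on either side,
# instead of silently multiplying rows.
merged = df.merge(
    stats, on=["UserID", "Timestamp"], how="left", validate="one_to_one"
)
print(merged)
```

Note that merging on columns returns a fresh 0..n-1 index, so the original index labels (50, 53, ...) are not carried over, just as in the output above.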
