
Multiple cumulative sums based on grouped columns

I have a dataset in which I would like to add two columns and then subtract a third, with both the addition and the subtraction applied as cumulative sums within each group.

Data

id   date   t1  t2  total   start   cur_t1  cur_t2  final_o finaldb de_t1   de_t2
a    q122   4   1   5       50      25      20      55      21      1       1
a    q222   1   1   2       50      25      20      57      22      0       0
a    q322   0   0   0       50      25      20      57      22      5       5
b    q122   5   5   10      100     30      40      110     27      4       4
b    q222   2   2   4       100     30      70      114     29      5       1
b    q322   3   4   7       100     30      70      121     33      0       1

Desired

id date t1  t2  total   start   cur_t1  cur_t2  final_o finaldb de_t1   de_t2   finalt1
a  q122 4   1   5       50      25      20      55      21      1       1       28
a  q222 1   1   2       50      25      20      57      22      0       0       29
a  q322 0   0   0       50      25      20      57      22      5       5       24
b  q122 5   5   10      100     30      40      110     27      4       4       31
b  q222 2   2   4       100     30      70      114     29      5       1       28
b  q322 3   4   7       100     30      70      121     33      0       1       31

Logic

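Create the 'finalt1' column by starting from 'cur_t1', adding the cumulative sum of 't1', and subtracting the cumulative sum of 'de_t1'.
The running totals restart for each 'id' and accumulate across that id's 'date' rows in order. For example, for id 'a' at q322: 25 + (4 + 1 + 0) - (1 + 0 + 5) = 24.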

Doing

df['finalt1'] = df['cur_t1'].add(df.groupby('id')['t1'].cumsum())

I am still researching how to subtract the 'de_t1' column cumulatively.

I can't test this right now, but logically the following should work:

df['finalt1'] = (df['cur_t1'].add(df.groupby('id')['t1'].cumsum())
                             .sub(df.groupby('id')['de_t1'].cumsum())
                 )
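
To check this against the question's data, here is a minimal self-contained sketch. It rebuilds only the columns involved and assumes the rows are already sorted by date within each id, as they are in the sample. The variable name g is only for illustration.

```python
import pandas as pd

# Sample data copied from the question (only the columns used here).
df = pd.DataFrame({
    "id":     ["a", "a", "a", "b", "b", "b"],
    "date":   ["q122", "q222", "q322", "q122", "q222", "q322"],
    "t1":     [4, 1, 0, 5, 2, 3],
    "cur_t1": [25, 25, 25, 30, 30, 30],
    "de_t1":  [1, 0, 5, 4, 5, 0],
})

# Running totals restart for every id; rows are assumed to be in date order.
g = df.groupby("id")
df["finalt1"] = df["cur_t1"] + g["t1"].cumsum() - g["de_t1"].cumsum()

print(df["finalt1"].tolist())  # [28, 29, 24, 31, 28, 31]
```

The result matches the 'finalt1' column in the desired output.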

Note that it is also possible to avoid grouping twice by computing both cumulative sums in a single groupby call and taking their difference across columns, but this is actually slower:

df['cur_t1'].add(df.groupby('id')[['de_t1', 't1']].cumsum().diff(axis=1)['t1'])
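
Because a cumulative sum is linear, another way to group only once is to subtract 'de_t1' from 't1' row by row first and then take a single grouped cumulative sum of the net change. This is a sketch of a different technique, not one of the approaches above, and I have not timed it. The names net, one_pass and two_pass are only for illustration.

```python
import pandas as pd

# Sample data copied from the question (only the columns used here).
df = pd.DataFrame({
    "id":     ["a", "a", "a", "b", "b", "b"],
    "t1":     [4, 1, 0, 5, 2, 3],
    "cur_t1": [25, 25, 25, 30, 30, 30],
    "de_t1":  [1, 0, 5, 4, 5, 0],
})

# Net change per row, then one cumulative sum per id.
net = df["t1"] - df["de_t1"]
one_pass = df["cur_t1"] + net.groupby(df["id"]).cumsum()

# Reference: the two-cumsum version, for comparison.
two_pass = (df["cur_t1"]
            + df.groupby("id")["t1"].cumsum()
            - df.groupby("id")["de_t1"].cumsum())

print(one_pass.tolist() == two_pass.tolist())  # True
```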
