I have a dataframe like the following:
   col1   col2
0     1   True
1     3   True
2     3   True
3     1  False
4     2   True
5     3   True
6     2  False
7     2   True
I want a running sum of col1 over the True rows. Whenever I see a False value in col2, I need the cumulative sum of col1 up to that point, counting only the True rows since the previous False. So, the DataFrame would look like the following:
   col1   col2  col3
0     1   True     0
1     3   True     0
2     3   True     0
3     1  False     7
4     2   True     0
5     3   True     0
6     2  False     5
7     2   True     0
How can I do this?
You can use a more general solution that also handles multiple consecutive False values. In that case the cumulative sum value is not changed, so every False row in the streak gets the sum of the preceding True run:
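# Label each run of consecutive equal col2 values, then sum col1 within each run
a = df.groupby((df.col2 != df.col2.shift()).cumsum())['col1'].transform('sum')
# Keep the sums of True runs, carry them forward onto False rows, zero out True rows
df['d'] = a.where(df.col2).ffill().mask(df.col2).fillna(0).astype(int)
print(df)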
   col1   col2  d
0     1   True  0
1     3   True  0
2     3   True  0
3     1  False  7
4     2   True  0
5     3   True  0
6     2  False  5
7     2   True  0
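To see what each link in that chain contributes, here is a self-contained sketch that rebuilds the frame from the question and keeps the intermediate Series. The names run_id, run_sum and carried are mine, chosen only for illustration; the logic is meant to be the same as the two lines above.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    'col1': [1, 3, 3, 1, 2, 3, 2, 2],
    'col2': [True, True, True, False, True, True, False, True],
})

# 1. Start a new run id every time col2 changes value: 1 1 1 2 3 3 4 5
run_id = (df['col2'] != df['col2'].shift()).cumsum()

# 2. Sum col1 within each run and broadcast it to every row of the run
run_sum = df.groupby(run_id)['col1'].transform('sum')

# 3. Keep only the sums of True runs, then carry the last one forward
#    so that the False rows that follow can see it
carried = run_sum.where(df['col2']).ffill()

# 4. Blank out the True rows and fill the remaining gaps with 0
df['d'] = carried.mask(df['col2']).fillna(0).astype(int)
print(df)
```

Step 3 is what makes consecutive False rows work: the sums of False runs are discarded before the forward fill, so a streak of False rows keeps seeing the last True-run sum.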
# Same data with two extra rows appended, both False in col2
print(df)
   col1   col2
0     1   True
1     3   True
2     3   True
3     1  False
4     2   True
5     3   True
6     2  False
7     2   True
8     4  False
9     4  False
# Rerun the same two steps on the extended frame
a = df.groupby((df.col2 != df.col2.shift()).cumsum())['col1'].transform('sum')
df['d'] = a.where(df.col2).ffill().mask(df.col2).fillna(0).astype(int)
print(df)
   col1   col2  d
0     1   True  0
1     3   True  0
2     3   True  0
3     1  False  7
4     2   True  0
5     3   True  0
6     2  False  5
7     2   True  0
8     4  False  2
9     4  False  2
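If you need this in more than one place, the same chain can be wrapped in a small function. This is only a sketch: preceding_true_run_sum and its parameters are hypothetical names of mine, not a pandas API, and it assumes the flag column has boolean dtype.

```python
import pandas as pd

def preceding_true_run_sum(df, value_col='col1', flag_col='col2'):
    """Hypothetical helper: on each False row, return the sum of value_col over
    the most recent run of True rows; return 0 on True rows and on False rows
    that come before any True run."""
    flag = df[flag_col]
    run_id = (flag != flag.shift()).cumsum()
    run_sum = df.groupby(run_id)[value_col].transform('sum')
    return run_sum.where(flag).ffill().mask(flag).fillna(0).astype(int)

# The extended frame from above, with two trailing False rows
df = pd.DataFrame({
    'col1': [1, 3, 3, 1, 2, 3, 2, 2, 4, 4],
    'col2': [True, True, True, False, True, True, False, True, False, False],
})
df['d'] = preceding_true_run_sum(df)
print(df)

# A frame that starts with False has no earlier True run to carry forward,
# so the leading False row falls through to fillna(0)
edge = pd.DataFrame({'col1': [5, 1, 2, 9], 'col2': [False, True, True, False]})
print(preceding_true_run_sum(edge).tolist())
```

Whether a leading False row should get 0 is a judgment call; it is simply what the fillna(0) at the end of the chain produces.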