Pandas: Cumulative sum of one column based on value of another
I am trying to calculate some statistics from a pandas DataFrame. It looks something like this:
id value conditional
1 10 0
2 20 0
3 30 1
1 15 1
3 5 0
1 10 1
So, I need to calculate the cumulative sum of the column value for each id from top to bottom, but only when conditional is 1.
So, this should give me something like:
id value conditional cumulative sum
1 10 0 0
2 20 0 0
3 30 1 30
1 15 1 15
3 5 0 30
1 10 1 25
So, the sum for id=1 is taken only over the rows where conditional=1 (the 4th and 6th rows), and the value in the 1st row is not counted.
How do I do this in pandas?
You can create a Series that is the product of value and conditional, and take the cumulative sum of it for each id group:
df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out:
id value conditional cumsum
0 1 10 0 0
1 2 20 0 0
2 3 30 1 30
3 1 15 1 15
4 3 5 0 30
5 1 10 1 25
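
For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the example frame from the question and applies the same product-then-groupby idea. The second column, cumsum_where, is my own addition and not part of the answer above: it uses Series.where to zero out rows whose flag is not 1, which may be preferable if conditional can hold values other than 0 and 1.

```python
import pandas as pd

# Rebuild the example data from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Approach from the answer: multiply by the 0/1 flag so rows with
# conditional == 0 contribute nothing, then take a running sum per id.
df["cumsum"] = (df["value"] * df["conditional"]).groupby(df["id"]).cumsum()

# Variant (an assumption, not from the original answer): keep value only
# where the flag equals 1 and replace everything else with 0.
masked = df["value"].where(df["conditional"] == 1, 0)
df["cumsum_where"] = masked.groupby(df["id"]).cumsum()

print(df)
```

Both columns should match the cumsum column shown in the output above. Grouping by df['id'] (the Series itself rather than a column name) works here because the product is a standalone Series that shares its index with df.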