
Pandas: Cumulative sum of one column based on value of another

I am trying to calculate some statistics from a pandas DataFrame. It looks something like this:

id     value     conditional
1      10        0
2      20        0
3      30        1
1      15        1
3      5         0
1      10        1

I need to calculate the cumulative sum of the column value for each id from top to bottom, but only over rows where conditional is 1.

This should give me something like:

id     value     conditional   cumulative sum
1      10        0             0
2      20        0             0
3      30        1             30
1      15        1             15
3      5         0             30
1      10        1             25

So the sum for id=1 is accumulated only from the 4th and 6th rows, where conditional=1, and the value in the 1st row is not counted. How do I do this in pandas?

You can create a Series that is the product of value and conditional, and take its cumulative sum within each id group:

df['cumsum'] = (df['value']*df['conditional']).groupby(df['id']).cumsum()
df
Out: 
   id  value  conditional  cumsum
0   1     10            0       0
1   2     20            0       0
2   3     30            1      30
3   1     15            1      15
4   3      5            0      30
5   1     10            1      25
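
For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample data from the question and applies the same multiplication approach. It also shows a variant using Series.where, which may be preferable if conditional is not strictly 0/1. The column name cumsum_where and the variable masked are my own illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 1, 3, 1],
    "value": [10, 20, 30, 15, 5, 10],
    "conditional": [0, 0, 1, 1, 0, 1],
})

# Multiplying by the 0/1 flag zeroes out rows that should not count,
# then the running total is taken within each id group.
df["cumsum"] = (df["value"] * df["conditional"]).groupby(df["id"]).cumsum()

# Variant (assumption: "counts" means conditional == 1 exactly).
# where() keeps value where the condition holds and substitutes 0 elsewhere.
masked = df["value"].where(df["conditional"] == 1, 0)
df["cumsum_where"] = masked.groupby(df["id"]).cumsum()

print(df)
```

Both columns should agree on this data. The where variant only differs if conditional can hold values other than 0 and 1, in which case the multiplication would scale value rather than simply include or exclude it.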
