Python reset cumulative sum over intervals in a column

I am trying to compute a cumulative sum over intervals, i.e. with the cumsum reset to zero whenever the next value to accumulate is 0. Below is an example, followed by the desired result. I have tried numpy 'convolve' and 'groupby' but can't come up with a way to do the reset except by writing a def that loops over all the rows. Is there a clever approach I'm missing? Note that the real data in column 'x' are real numbers separated by 0's.

import numpy as np
import pandas as pd

a = pd.DataFrame([[0,0],[1,0],[1,0],[1,0],[0,0],[0,0],[0,0],[0,0],[0,0],[0,0],
                  [0,0],[0,0],[0,0],[0,0],[1,0],[1,0],[0,0]], columns=["x","y"])

def patch(k):
  k["z"] = k.x.cumsum()
  return k

print(patch(a))

Current output:

    x  y  z
0   0  0  0
1   1  0  1
2   1  0  2
3   1  0  3
4   0  0  3
5   0  0  3
6   0  0  3
7   0  0  3
8   0  0  3
9   0  0  3
10  0  0  3
11  0  0  3
12  0  0  3
13  0  0  3
14  1  0  4
15  1  0  5
16  0  0  5

Desired output:

    x  y  z
0   0  0  0
1   1  0  1
2   1  0  2
3   1  0  3
4   0  0  0
5   0  0  0
6   0  0  0
7   0  0  0
8   0  0  0
9   0  0  0
10  0  0  0
11  0  0  0
12  0  0  0
13  0  0  0
14  1  0  1
15  1  0  2
16  0  0  0
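
For reference, the row-by-row loop mentioned above might look roughly like the sketch below. The function name cumsum_with_reset and the shortened sample series are made up for illustration; this is only a baseline for checking a vectorized answer against, not code from the original post.

```python
import pandas as pd

def cumsum_with_reset(values):
    """Running total that drops back to zero whenever a 0 is encountered."""
    total = 0
    out = []
    for v in values:
        # A zero resets the accumulator; anything else is added to it.
        total = 0 if v == 0 else total + v
        out.append(total)
    return out

# Hypothetical, shortened sample in the same shape as column 'x'.
x = pd.Series([0, 1, 1, 1, 0, 0, 1, 1, 0])
z = cumsum_with_reset(x)
print(z)
```

This is easy to read but runs in pure Python, so it is likely to be slow on long columns.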

Do a groupby on the cumsum of the zero mask, so that every 0 in 'x' starts a new group:

a['z'] = a.groupby(a['x'].eq(0).cumsum())['x'].cumsum()

Output:

    x  y  z
0   0  0  0
1   1  0  1
2   1  0  2
3   1  0  3
4   0  0  0
5   0  0  0
6   0  0  0
7   0  0  0
8   0  0  0
9   0  0  0
10  0  0  0
11  0  0  0
12  0  0  0
13  0  0  0
14  1  0  1
15  1  0  2
16  0  0  0
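
To see why the groupby works, it may help to look at the intermediate grouping key. The sketch below uses a small made-up series of real numbers separated by zeros (the values and the name block_id are assumptions for illustration, not data from the question). Each zero bumps the running count of zeros, so that count serves as a block label, and the cumulative sum restarts inside every block.

```python
import pandas as pd

# Hypothetical sample: real numbers separated by zeros.
x = pd.Series([0, 1.5, 2.0, 0, 0, 3.0, 0.5, 0], name="x")

# True wherever x is 0; the running count of those zeros labels each block.
block_id = x.eq(0).cumsum()

# Cumulative sum computed separately within each block, so it restarts at every zero.
z = x.groupby(block_id).cumsum()

print(pd.DataFrame({"x": x, "block_id": block_id, "z": z}))
```

Because a zero always opens its block, it contributes 0 to that block's running sum, which gives the reset. Values that appear before the first zero all share block 0 and are accumulated normally.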
