Pandas cumsum by chunk

Question

In my dataset, I have two columns:

N: an ID number that identifies each row
Indicator: either 0 or 1

What I would like to obtain:

Cumsum: the cumulative sum of the Indicator column, computed only over consecutive values of 1 and reset to 0 at every 0.
Total: for each chunk of consecutive nonzero values, the number of nonzero values in that chunk (equivalently, the max or the last value of the cumulative sum), repeated on every row of the chunk.

How can I get the two columns efficiently?

(A for loop over the rows would not be efficient.)
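To pin down what the two columns should contain, here is a minimal sketch of the slow row-by-row version that the question wants to avoid. The sample data and the helper name `reference_columns` are assumptions made for illustration; only the column name Indicator comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({"Indicator": [0, 0, 1, 1, 1, 0, 0, 1, 1]})


def reference_columns(indicator):
    """Slow reference implementation using explicit Python loops."""
    values = list(indicator)

    # Forward pass: count consecutive 1s, resetting at every 0.
    cumsum = []
    running = 0
    for v in values:
        running = running + 1 if v == 1 else 0
        cumsum.append(running)

    # Backward pass: give every row of a run of 1s the run's final count.
    total = [0] * len(values)
    run_total = 0
    for i in range(len(values) - 1, -1, -1):
        if values[i] == 1:
            run_total = max(run_total, cumsum[i])
            total[i] = run_total
        else:
            run_total = 0
    return cumsum, total


expected_cumsum, expected_total = reference_columns(df["Indicator"])
print(expected_cumsum)  # [0, 0, 1, 2, 3, 0, 0, 1, 2]
print(expected_total)   # [0, 0, 3, 3, 3, 0, 0, 2, 2]
```

A loop like this is handy as a correctness check for any vectorized solution, even if it is too slow for large frames.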

Answer 1

Example

The answer needs a small example DataFrame to work with:

import pandas as pd

df = pd.DataFrame([0, 0, 1, 1, 1, 0, 0, 1, 1], columns=['Ind'])

df

Code

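# Label each run of equal consecutive values: the label increases whenever Ind changes.
g = df['Ind'].ne(df['Ind'].shift()).cumsum()
# Cumulative sum within each run; runs of 0 stay at 0.
df['Cumsum'] = df.groupby(g)['Ind'].cumsum()
# Broadcast each run's maximum cumulative sum to all of its rows.
df['Total'] = df.groupby(g)['Cumsum'].transform('max')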

df

   Ind  Cumsum  Total
0    0       0      0
1    0       0      0
2    1       1      3
3    1       2      3
4    1       3      3
5    0       0      0
6    0       0      0
7    1       1      2
8    1       2      2
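The grouping key is the part of this answer that is easiest to misread, so the sketch below prints it step by step. The intermediate name `starts` is introduced here for illustration. Computing Total as the per-run sum of Ind is a small variation on the answer, not the answer's own code; for 0/1 data it matches the max of the cumulative sum.

```python
import pandas as pd

df = pd.DataFrame({"Ind": [0, 0, 1, 1, 1, 0, 0, 1, 1]})

# True wherever the value differs from the previous row.
# The first row compares against NaN, so it always counts as a change.
starts = df["Ind"].ne(df["Ind"].shift())

# Running count of changes: one distinct label per run of equal values.
g = starts.cumsum()
print(g.tolist())  # [1, 1, 2, 2, 2, 3, 3, 4, 4]

# Cumulative sum restarted in every run; runs of 0 simply stay at 0.
df["Cumsum"] = df.groupby(g)["Ind"].cumsum()

# Variation (assumption): the sum of a run of 1s equals its length,
# which is the same as the max of its cumulative sum.
df["Total"] = df.groupby(g)["Ind"].transform("sum")

print(df)
```

Because the labels in g separate runs of 0 from runs of 1, no extra masking is needed to keep the 0 rows at 0.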

Answer 2

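The same logic with a different grouping key: every 0 starts a new group, so each run of 1s falls into the group opened by the 0 just before it. The column is named indicator here.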

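df = pd.DataFrame({'indicator': [0, 0, 1, 1, 1, 0, 0, 1, 1]})
# Every 0 opens a new group; the 1s that follow it share its label.
s = df['indicator'].eq(0).cumsum()
# Position within the group: the opening 0 gets 0, the following 1s get 1, 2, ...
df['new1'] = df.groupby(s).cumcount()
# Group sum = number of 1s; multiplying by the indicator zeroes out the 0 rows.
df['new2'] = df.groupby(s)['indicator'].transform('sum')*df['indicator']
df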
Out[458]: 
   indicator  new1  new2
0          0     0     0
1          0     0     0
2          1     1     3
3          1     2     3
4          1     3     3
5          0     0     0
6          0     0     0
7          1     1     2
8          1     2     2
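One edge case is worth checking with this grouping key. If the column starts with a run of 1s, that run has no leading 0 in its group, so cumcount numbers it from 0 instead of 1. The sketch below uses hypothetical input data and illustrative column names (`via_cumcount`, `via_cumsum`, `total`) to show the difference. Swapping cumcount for a grouped cumsum of the indicator is a suggested variation, not part of the original answer.

```python
import pandas as pd

# Hypothetical input that starts with a run of 1s.
df = pd.DataFrame({"indicator": [1, 1, 0, 1, 1, 1, 0]})

# Every 0 opens a new group; the leading 1s land in group 0 on their own.
s = df["indicator"].eq(0).cumsum()

# cumcount numbers rows from 0 within each group, so the leading run is off by one.
df["via_cumcount"] = df.groupby(s).cumcount()

# Summing the indicator within each group counts only the 1s.
df["via_cumsum"] = df.groupby(s)["indicator"].cumsum()

# The total is unaffected: group sum times the indicator.
df["total"] = df.groupby(s)["indicator"].transform("sum") * df["indicator"]

print(df)
```

On data that begins with 0, as in the example above, both variants give the same numbers.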

2 answers

solution1
1 2022-12-26 13:32:40

solution2
0 2022-12-26 15:32:27
