Pandas group consecutive and label the length

Question

I want to get data labeled with the length of each consecutive run.

I want:

Then I can calculate the mean of column "b" grouped by "c". I tried shift, cumsum, and cumcount, but none of them worked.

Answer 1

Use GroupBy.transform on groups of consecutive equal values to get each run's size, then set the result to 0 wherever column a is not 1:

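# Group rows into runs of consecutive equal values in a, then give every row its run's size
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .where(df.a.eq(1), 0))  # keep the size only where a == 1, otherwise 0
print(df)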
    a  b  c  c1
0   1  1  1   1
1   0  2  0   0
2   1  3  2   2
3   1  2  2   2
4   0  1  0   0
5   1  3  3   3
6   1  1  3   3
7   1  3  3   3
8   0  2  0   0
9   1  2  2   2
10  1  1  2   2
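
To see why the grouping key works, the one-liner can be broken into its intermediate steps. This is only a sketch: the sample frame is reconstructed from columns a and b of the printed output above, and the names `starts`, `run_id`, and `run_len` are illustrative, not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the printed output (columns a and b only).
df = pd.DataFrame({
    "a": [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    "b": [1, 2, 3, 2, 1, 3, 1, 3, 2, 2, 1],
})

# True wherever the value differs from the previous row, i.e. a new run starts.
# The first row is compared against NaN, so it always counts as a start.
starts = df["a"].ne(df["a"].shift())

# Running total of run starts: all rows in the same run share one id.
run_id = starts.cumsum()

# Size of the run each row belongs to, aligned to the original index.
run_len = df.groupby(run_id)["a"].transform("size")

# Keep the size only where a == 1; every other row becomes 0.
c1 = run_len.where(df["a"].eq(1), 0)

print(pd.DataFrame({"a": df["a"], "run_id": run_id, "run_len": run_len, "c1": c1}))
```

The key point is that `cumsum()` over the "value changed" flags turns each run into its own group id, which is what `shift` or `cumsum` alone cannot do.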

If column a contains only 0 and 1 values, it is also possible to multiply by a instead:

# Same run sizes as before; multiplying by a zeroes the rows where a == 0
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .mul(df.a))
print(df)
    a  b  c  c1
0   1  1  1   1
1   0  2  0   0
2   1  3  2   2
3   1  2  2   2
4   0  1  0   0
5   1  3  3   3
6   1  1  3   3
7   1  3  3   3
8   0  2  0   0
9   1  2  2   2
10  1  1  2   2
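
The question's stated goal, the mean of b per length label, could then be computed roughly as follows. This is a sketch under the same assumption that the sample frame matches the printed output; `run_id`, `mean_by_len`, and `mean_by_run` are illustrative names.

```python
import pandas as pd

# Sample data reconstructed from the printed output (columns a and b only).
df = pd.DataFrame({
    "a": [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    "b": [1, 2, 3, 2, 1, 3, 1, 3, 2, 2, 1],
})

# One id per run of consecutive equal values in a.
run_id = df["a"].ne(df["a"].shift()).cumsum()

# Run length for rows where a == 1, and 0 elsewhere (a holds only 0/1 here).
df["c1"] = df.groupby(run_id)["a"].transform("size").mul(df["a"])

# Mean of b per length label; separate runs of equal length are pooled together.
mean_by_len = df.groupby("c1")["b"].mean()

# Alternative: mean of b per individual run, if runs should stay separate.
mean_by_run = df.groupby(run_id)["b"].mean()

print(mean_by_len)
print(mean_by_run)
```

Note that grouping by the length label merges, for example, both runs of length 2 into one group. If each run should be averaged on its own, grouping by the run id is likely the better fit.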
