Pandas: group consecutive values and label each run with its length
I want to label each row with the length of the consecutive run it belongs to. My data:
a b
---
1 1
0 2
1 3
1 2
0 1
1 3
1 1
1 3
0 3
1 2
1 1
I want:
a b | c
--------
1 1 1
0 2 0
1 3 2
1 2 2
0 1 0
1 3 3
1 1 3
1 3 3
0 2 0
1 2 2
1 1 2
Then I can calculate the mean of column "b" grouped by "c". I tried shift, cumsum and cumcount, but none of them worked.
Use GroupBy.transform over groups of consecutive values to get the length of each run, then set the result to 0 wherever column a is not 1:
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .where(df.a.eq(1), 0))
print(df)
a b c c1
0 1 1 1 1
1 0 2 0 0
2 1 3 2 2
3 1 2 2 2
4 0 1 0 0
5 1 3 3 3
6 1 1 3 3
7 1 3 3 3
8 0 2 0 0
9 1 2 2 2
10 1 1 2 2
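The one-liner above packs several steps together. As a rough sketch of what happens at each step, assuming the column a from the question (the intermediate names changed, run_id and run_len are illustrative only, not part of the original answer):

```python
import pandas as pd

# Column "a" from the question.
a = pd.Series([1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1], name="a")

# True wherever the value differs from the previous row, i.e. a new run starts.
# The first row compares against NaN, so it always counts as a start.
changed = a.ne(a.shift())

# The running total of run starts gives every consecutive run its own id.
run_id = changed.cumsum()

# Broadcast the size of each run back onto all of its rows.
run_len = a.groupby(run_id).transform("size")

# Keep the length only where a == 1; runs of other values become 0.
c = run_len.where(a.eq(1), 0)

print(pd.DataFrame({"a": a, "run_id": run_id, "run_len": run_len, "c": c}))
```

Here run_id comes out as 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 7, so the two adjacent 1s in rows 2 and 3 share an id and both receive a length of 2.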
If column a contains only 0 and 1 values, it is also possible to multiply by a instead:
df['c1'] = (df.groupby(df.a.ne(df.a.shift()).cumsum())['a']
              .transform('size')
              .mul(df.a))
print(df)
a b c c1
0 1 1 1 1
1 0 2 0 0
2 1 3 2 2
3 1 2 2 2
4 0 1 0 0
5 1 3 3 3
6 1 1 3 3
7 1 3 3 3
8 0 2 0 0
9 1 2 2 2
10 1 1 2 2
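With the label in place, the mean of b per group mentioned in the question could be computed roughly as follows. This is a sketch that assumes the a and b values shown in the printed output above; the names mean_by_len, run_id and mean_by_run are illustrative only.

```python
import pandas as pd

# Values of "a" and "b" as they appear in the printed output above (assumption).
df = pd.DataFrame({
    "a": [1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1],
    "b": [1, 2, 3, 2, 1, 3, 1, 3, 2, 2, 1],
})

# Id of each consecutive run, then the run length for rows where a == 1.
run_id = df.a.ne(df.a.shift()).cumsum()
df["c"] = df.groupby(run_id)["a"].transform("size").where(df.a.eq(1), 0)

# Mean of b for each run length: all runs of the same length are pooled.
mean_by_len = df.groupby("c")["b"].mean()
print(mean_by_len)

# Alternative if a mean per individual run is wanted: group by run_id instead.
mean_by_run = df.groupby(run_id)["b"].mean()
print(mean_by_run)
```

Note that grouping by c pools separate runs of equal length (rows 2 to 3 and rows 9 to 10 both have c = 2), whereas grouping by run_id keeps every run apart. Which of the two is wanted depends on the analysis.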