Add count column to dataframe that counts when another row changes
I have a dataframe that has a column like this:
x
0 1
1 1
2 0
3 1
4 0
5 0
6 0
7 1
8 1
9 1
I'd like to add a column that counts up every time x changes, so that my final result looks like this:
x y
0 1 0
1 1 0
2 0 1
3 1 2
4 0 3
5 0 3
6 0 3
7 1 4
8 1 4
9 1 4
I can't figure out the fastest way to do this without looping. I also don't care whether y starts at 0 or 1. I'm sure there's something built into pandas I can use. Can you help?
P.S. The reason I need this y column is to be able to group the rows by each number. If there's a way to accomplish essentially the same thing without creating it, that would work too.
After diff, you can apply cumsum:
df.x.diff().ne(0).cumsum()-1
Out[132]:
0 0
1 0
2 1
3 2
4 3
5 3
6 3
7 4
8 4
9 4
Name: x, dtype: int32
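For anyone who wants to run this end to end, here is a minimal self-contained sketch. The frame construction is my reconstruction of the question's sample data, not part of the original answer, and the result is stored as y to match the desired output.

```python
import pandas as pd

# Sample data reconstructed from the question
df = pd.DataFrame({"x": [1, 1, 0, 1, 0, 0, 0, 1, 1, 1]})

# diff() is NaN on the first row and non-zero wherever x changes.
# ne(0) is therefore True on the first row and on every change,
# and cumsum() turns those flags into a running counter starting at 1.
# Subtracting 1 makes the counter start at 0.
df["y"] = df["x"].diff().ne(0).cumsum() - 1

print(df["y"].tolist())  # [0, 0, 1, 2, 3, 3, 3, 4, 4, 4]
```

Drop the trailing - 1 if you would rather have y start at 1.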
Note: the approach below generalizes to object dtype as well, since it evaluates equality rather than taking a difference.
df.assign(y=np.append(False, df.x.values[1:] != df.x.values[:-1]).cumsum())
x y
0 1 0
1 1 0
2 0 1
3 1 2
4 0 3
5 0 3
6 0 3
7 1 4
8 1 4
9 1 4
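To illustrate the object dtype point, and the P.S. about grouping without creating y, here is a rough sketch. The string data and the names vals, changed, run_id and run_lengths are made up for illustration. It assumes you only need the run ids as a grouping key.

```python
import numpy as np
import pandas as pd

# Hypothetical string column: values cannot be subtracted, so compare them instead
df = pd.DataFrame({"x": ["a", "a", "b", "a", "b", "b", "b", "a", "a", "a"]})

vals = df["x"].to_numpy()
# True wherever a value differs from the one before it;
# the first row is padded with False, so it never counts as a change
changed = np.append(False, vals[1:] != vals[:-1])
run_id = changed.cumsum()

# Group on the run ids directly, without adding a y column to the frame
run_lengths = df.groupby(run_id)["x"].size().tolist()

print(run_id.tolist())  # [0, 0, 1, 2, 3, 3, 3, 4, 4, 4]
print(run_lengths)      # [2, 1, 1, 3, 3]
```

Passing the array straight to groupby keeps the frame unchanged, which is handy if y is only needed as a temporary key.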