Add count column to dataframe that counts up when the value in another column changes

I have a dataframe that has a column like this:

      x
0     1
1     1
2     0
3     1
4     0
5     0
6     0
7     1
8     1
9     1

I'd like to add a column that counts up every time x changes so that my final result looks like this:

      x     y
0     1     0
1     1     0
2     0     1
3     1     2
4     0     3
5     0     3
6     0     3
7     1     4
8     1     4
9     1     4

I can't figure out the fastest way to do this without looping. I also don't care if y starts at 0 or 1. I'm sure there's something innate to pandas I can use. Can you help?

PS. The reason I need this y column is to be able to group the rows by each number. If there's a way to accomplish the same thing without creating it, that would work too.

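After diff, flag the rows where the difference is non-zero with ne(0), then apply cumsum. The first row's diff is NaN, which also counts as non-zero, so subtract 1 to make the count start at 0: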

df.x.diff().ne(0).cumsum()-1
Out[132]: 
0    0
1    0
2    1
3    2
4    3
5    3
6    3
7    4
8    4
9    4
Name: x, dtype: int32
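
To address the PS in the question, here is a minimal sketch of using that same diff/cumsum result as a grouping key without storing it first. The names run_id and run_lengths, and the "run" label, are made up for illustration and are not from the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"x": [1, 1, 0, 1, 0, 0, 0, 1, 1, 1]})

# diff() is NaN on the first row and NaN != 0 is True, so the running
# sum starts at 1; subtracting 1 makes the first run id 0.
run_id = (df["x"].diff().ne(0).cumsum() - 1).rename("run")  # hypothetical name

# Option 1: store it as the y column the question asks for.
df["y"] = run_id

# Option 2: pass the Series straight to groupby, no extra column needed.
run_lengths = df.groupby(run_id)["x"].size()
print(run_lengths)
```

Passing the Series directly works because groupby aligns it with the frame's index, so each row is assigned to its run either way.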

With NumPy arrays

Note: This generalizes to object dtype as well, since it compares neighbouring values for equality instead of subtracting them.

df.assign(y=np.append(False, df.x.values[1:] != df.x.values[:-1]).cumsum())

   x  y
0  1  0
1  1  0
2  0  1
3  1  2
4  0  3
5  0  3
6  0  3
7  1  4
8  1  4
9  1  4
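
As a rough illustration of the object dtype note, the sketch below applies the same neighbour comparison to a column of strings, where diff() would not work because subtraction is not defined for strings. The sample data is invented, and the shift-based line at the end is an alternative pandas spelling that does not appear in the answers above.

```python
import numpy as np
import pandas as pd

# Made-up string data; not from the question.
df = pd.DataFrame({"x": ["a", "a", "b", "a", "b", "b"]})

values = df["x"].to_numpy()

# Compare each element with its predecessor; the first row has no
# predecessor, so it is marked as "not changed" with a leading False.
changed = np.append(False, values[1:] != values[:-1])

# The running total of the change flags is the run counter, starting at 0.
df = df.assign(y=changed.cumsum())

# Alternative pandas-only spelling: compare with the shifted column.
# The first row compares against NaN, which counts as a change, hence the -1.
y_shift = df["x"].ne(df["x"].shift()).cumsum() - 1

print(df)
```

Both spellings rely only on equality, so they should behave the same for numbers, strings, or other comparable values.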
