Is there an efficient way to categorise rows of sequentially increasing data into groups in a pandas DataFrame?

I have a dataset that looks roughly like this (the first column being the index):

   measurement     value
0            1  0.617350
1            2  0.394176
2            3  0.775822
3            1  0.811693
4            2  0.621867
5            3  0.743718
6            4  0.183111
7            1  0.118586
8            2  0.274038
9            3  0.871772

The values in the second column are sequentially increasing measurement parameters. The test cycles through these parameters, taking a reading at each step, before resetting and starting again from the beginning.

The challenge is that I need to label each cycle with a group number in a fourth column:

   measurement     value  group
0            1  0.617350      1
1            2  0.394176      1
2            3  0.775822      1
3            1  0.811693      2
4            2  0.621867      2
5            3  0.743718      2
6            4  0.183111      2
7            1  0.118586      3
8            2  0.274038      3
9            3  0.871772      3

The only solution I can think of is two nested for loops: the first finds the start of each measurement cycle, and the second counts to the end of that cycle and then labels the group. This doesn't seem very efficient, though. Is there a better way?

If every cycle starts with a measurement of 1, compare the column with 1 and take the cumulative sum of the resulting booleans:

df['group'] = df['measurement'].eq(1).cumsum()
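
For reference, here is a minimal runnable sketch that rebuilds the sample data from the question and applies the one-liner. It also shows a possible variant for the case where cycles do not always start at 1: it treats any row whose measurement fails to increase as the start of a new cycle. The column name `group_by_reset` is made up for illustration and is not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "measurement": [1, 2, 3, 1, 2, 3, 4, 1, 2, 3],
    "value": [0.617350, 0.394176, 0.775822, 0.811693, 0.621867,
              0.743718, 0.183111, 0.118586, 0.274038, 0.871772],
})

# eq(1) marks the first row of each cycle as True; cumsum() then counts
# how many cycle starts have been seen so far, which is the group label.
df["group"] = df["measurement"].eq(1).cumsum()

# Variant (assumption: a cycle ends whenever the measurement does not
# increase). diff() is NaN on the first row, and NaN <= 0 is False, so
# the running count starts at 0; adding 1 makes the labels start at 1.
resets = df["measurement"].diff().le(0)
df["group_by_reset"] = resets.cumsum() + 1

print(df)
```

Both approaches are vectorised, so they avoid the explicit Python loops described in the question. On the sample data they give the same labels; they would differ only if a cycle started at a value other than 1.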
