
Is there an efficient way to categorise rows of sequentially increasing data into groups in a pandas DataFrame?

I have a dataset that looks roughly like this (the first column being the index):

   measurement     value
0            1  0.617350
1            2  0.394176
2            3  0.775822
3            1  0.811693
4            2  0.621867
5            3  0.743718
6            4  0.183111
7            1  0.118586
8            2  0.274038
9            3  0.871772

The values in the measurement column are sequentially increasing measurement parameters. The test cycles through these parameters, taking a reading at each step, before resetting and starting again from the beginning.

The challenge I face is that I need to label each cycle with a group number in a new column:

   measurement     value  group
0            1  0.617350      1
1            2  0.394176      1
2            3  0.775822      1
3            1  0.811693      2
4            2  0.621867      2
5            3  0.743718      2
6            4  0.183111      2
7            1  0.118586      3
8            2  0.274038      3
9            3  0.871772      3

The only solution I can think of is two nested for loops: the first finds the start of each measurement cycle, the second counts to the end of that cycle, and then the group is labelled. This doesn't seem very efficient, though. Is there a better way?

If each cycle starts with a measurement of 1, compare the measurement column with 1 and take the cumulative sum of the resulting boolean mask:

df['group'] = df['measurement'].eq(1).cumsum()
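As a minimal, self-contained sketch of the line above, the following rebuilds the sample data from the question and applies the same `eq(1).cumsum()` idea. It also shows a variant that is not part of the original answer: if cycles cannot be assumed to start at exactly 1, a new group can instead begin wherever the measurement drops relative to the previous row. The column name `group_by_reset` is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "measurement": [1, 2, 3, 1, 2, 3, 4, 1, 2, 3],
    "value": [0.617350, 0.394176, 0.775822, 0.811693, 0.621867,
              0.743718, 0.183111, 0.118586, 0.274038, 0.871772],
})

# True wherever a cycle starts (measurement == 1); the cumulative sum
# of that boolean mask numbers the cycles 1, 2, 3, ...
df["group"] = df["measurement"].eq(1).cumsum()

# Variant (an assumption, not from the original answer): start a new
# group wherever the measurement decreases compared with the previous row.
# diff() is NaN for the first row, so lt(0) is False there; add 1 so the
# labels start at 1.
drops = df["measurement"].diff().lt(0)
df["group_by_reset"] = drops.cumsum() + 1

print(df)
```

Both approaches are vectorised, so they avoid the nested Python loops described in the question. On this sample data they produce the same labels as the expected output shown in the question.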


 