
Pandas: add an incremental number based on another column

Consider a dataframe with a column like this:

sequence 
1
2
3
4
5
1
2
3
1
2
3
4
5
6
7

I want to add a column that increments each time the sequence resets. The runs are of variable length.

The result should look like this:

sequence run
1 1
2 1
3 1
4 1
5 1
1 2
2 2
3 2
1 3
2 3
3 3
4 3
5 3
6 3
7 3

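Use diff followed by cumsum. A new run starts wherever the difference from the previous row is not 1, and the cumulative sum of those start flags numbers the runs:

df['sequence'].diff().ne(1).cumsum()
Out[349]: 
0     1
1     1
2     1
3     1
4     1
5     2
6     2
7     2
8     3
9     3
10    3
11    3
12    3
13    3
14    3
Name: sequence, dtype: int32

Assign the result to create the column:

df['run'] = df['sequence'].diff().ne(1).cumsum()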
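
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample column from the question and breaks the one-liner into its three steps. The intermediate names step and is_start are only for illustration. The alt line at the end is a simpler variant that only works under the assumption that every run starts at exactly 1, which holds for the sample data but is not stated as a general rule in the question.

```python
import pandas as pd

# Sample data copied from the question: three runs of lengths 5, 3 and 7.
df = pd.DataFrame({"sequence": [1, 2, 3, 4, 5, 1, 2, 3, 1, 2, 3, 4, 5, 6, 7]})

# Step 1: difference from the previous row (NaN for the first row).
step = df["sequence"].diff()

# Step 2: flag the rows where the step is not +1. NaN also counts as
# "not equal to 1", so the very first row is flagged as a start.
is_start = step.ne(1)

# Step 3: the running total of the flags numbers the runs 1, 2, 3, ...
df["run"] = is_start.cumsum()

# Variant (assumption: every run begins at exactly 1).
alt = df["sequence"].eq(1).cumsum()

print(df)
```

The diff version also handles runs that do not start at 1, as long as consecutive values inside a run increase by exactly 1.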

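For the reverse direction, where the run column already exists and you need the position of each row within its run, use groupby with cumcount: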

dataset['sequence'] = dataset.groupby('run').cumcount().add(1)

Example output (run is the grouping key, sequence counts the rows within each group):

run sequence
y 1
a 1
g 1
a 2
b 1
a 3
b 2
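
A small runnable sketch of the groupby approach, assuming a frame named dataset whose run column holds the letters from the example above. The second frame, numbered, is a hypothetical addition that uses the run numbers from the question to show that cumcount rebuilds the original sequence column.

```python
import pandas as pd

# Grouping keys taken from the example output above.
dataset = pd.DataFrame({"run": ["y", "a", "g", "a", "b", "a", "b"]})

# cumcount() numbers the rows of each group 0, 1, 2, ... in their
# original order; add(1) shifts the numbering to start at 1.
dataset["sequence"] = dataset.groupby("run").cumcount().add(1)
print(dataset)

# Hypothetical round trip with the run numbers from the question.
numbered = pd.DataFrame({"run": [1] * 5 + [2] * 3 + [3] * 7})
numbered["sequence"] = numbered.groupby("run").cumcount().add(1)
print(numbered)
```

Note that groupby collects all rows sharing a key, even when they are not adjacent, so the letter a is counted 1, 2, 3 across the whole frame.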

