I need to create a new "identifier column" with a unique value for each combination of values in two columns. For example, rows should share the same identifier when ID and phase are both the same (e.g. r1 and ph1), but a new, unique value should be assigned when the combination differs (e.g. r1 and ph2).
df
ID phase side values
r1 ph1 l 12
r1 ph1 r 34
r1 ph2 l 93
s4 ph3 l 21
s3 ph2 l 88
s3 ph2 r 54
...
I would need a new column (idx) like so:
new_df
ID phase side values idx
r1 ph1 l 12 1
r1 ph1 r 34 1
r1 ph2 l 93 2
s4 ph3 l 21 3
s3 ph2 l 88 4
s3 ph2 r 54 4
...
I've tried applying code from this question but could not find a way to increment the values in idx.
Any suggestion on how to accomplish this would be very welcome!
Try groupby with ngroup() + 1, and pass sort=False to ensure groups are enumerated in the order they appear in the DataFrame:
df['idx'] = df.groupby(['ID', 'phase'], sort=False).ngroup() + 1
df:
ID phase side values idx
0 r1 ph1 l 12 1
1 r1 ph1 r 34 1
2 r1 ph2 l 93 2
3 s4 ph3 l 21 3
4 s3 ph2 l 88 4
5 s3 ph2 r 54 4
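For reference, here is a self-contained sketch that rebuilds the sample frame from the question and compares sort=False with the default sort=True. The column name idx_sorted is only an illustrative name, not something from the question; it shows why sort=False matters when the numbering should follow the order of first appearance.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "ID": ["r1", "r1", "r1", "s4", "s3", "s3"],
    "phase": ["ph1", "ph1", "ph2", "ph3", "ph2", "ph2"],
    "side": ["l", "r", "l", "l", "l", "r"],
    "values": [12, 34, 93, 21, 88, 54],
})

# sort=False: groups are numbered in order of first appearance.
# ngroup() starts at 0, so add 1 to start the identifiers at 1.
df["idx"] = df.groupby(["ID", "phase"], sort=False).ngroup() + 1

# Default sort=True: groups are numbered by sorted key order instead,
# so (s3, ph2) gets a lower number than (s4, ph3).
# "idx_sorted" is a hypothetical column name used only for comparison.
df["idx_sorted"] = df.groupby(["ID", "phase"]).ngroup() + 1

print(df)
```

With this data, idx follows the row order (1, 1, 2, 3, 4, 4), while idx_sorted swaps the last two groups because s3 sorts before s4.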