I was wondering if someone knows a good way to sort a pandas DataFrame in the following way:
I have the following randomly sorted data, with an id that appears multiple times and a label that is either 0 or 1:
id | label
------ | ------
1 | 1
1 | 0
1 | 0
2 | 1
2 | 0
2 | 0
3 | 0
3 | 0
3 | 0
I would like to sort by label in ascending order and also sort the ids in ascending order, but cycling through the ids (1, 2, 3, 1, 2, 3, ...) rather than grouping rows with the same id together, like this:
id | label
------ | ------
1 | 0
2 | 0
3 | 0
1 | 0
2 | 0
3 | 0
3 | 0
1 | 1
2 | 1
Thanks in advance!
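First sort by id and label, then use cumcount to number each row within its id group (0, 1, 2, ...) and set that as the index. Sorting on that index and then on label gives the interleaved order. Note that cumcount should be computed on the already sorted frame (set_index takes the values by position), and the last two sorts need a stable algorithm so that each one keeps the order produced by the previous step.
df_sorted = df.sort_values(by=['id', 'label'])
df_out = (
    df_sorted
    .set_index(df_sorted.groupby('id').cumcount())  # position of each row within its id
    .sort_index(kind='mergesort')                   # stable: keeps id order within each position
    .sort_values(by='label', kind='mergesort')      # stable: keeps the interleaved order within each label
)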
Output:
   id  label
0   1      0
0   2      0
0   3      0
1   1      0
1   2      0
1   3      0
2   3      0
2   1      1
2   2      1
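For anyone who wants to try this end to end, here is a minimal self-contained sketch. It is a variation on the approach above rather than the exact same code: instead of moving the within-id position into the index, it keeps it in a temporary column and does one multi-key sort. The function name `round_robin_sort`, the helper column `pos`, and the shuffle seed are made up for illustration.

```python
import pandas as pd


def round_robin_sort(df: pd.DataFrame) -> pd.DataFrame:
    """Sort by label, interleaving ids (1, 2, 3, 1, 2, 3, ...) within each label."""
    # Sort first so that cumcount numbers the rows of each id in label order.
    ordered = df.sort_values(by=["id", "label"])
    # Position of each row within its id group: 0, 1, 2, ...
    pos = ordered.groupby("id").cumcount()
    return (
        ordered.assign(pos=pos)                  # hypothetical helper column
        .sort_values(by=["label", "pos", "id"])  # label first, then round-robin over ids
        .drop(columns="pos")
        .reset_index(drop=True)
    )


# Sample data from the question.
df = pd.DataFrame({
    "id":    [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "label": [1, 0, 0, 1, 0, 0, 0, 0, 0],
})

# Shuffle to mimic randomly ordered input; the seed is arbitrary.
shuffled = df.sample(frac=1, random_state=0)

result = round_robin_sort(shuffled)
print(result)
```

Because the sort keys (label, pos, id) together identify each row's place, the result should not depend on the input order or on which sorting algorithm pandas picks. Unlike the version above, this one resets the index to 0..n-1 instead of leaving the cumcount values in it.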