简体   繁体   中英

Pandas Dataframe ordering and sorting of column values

I was wondering if someone knows a good way on how to sort a pandas dataframe in the following way:

a) I have the following randomly sorted data with an id that appears multiple times and a label that is either 0 or 1:

id | label
------ | ------ 
1 | 1
1 | 0
1 | 0
2 | 1
2 | 0
2 | 0
3 | 0
3 | 0
3 | 0

I would like to sort the labels in ascending order and then also sort the id's in ascending order, but not grouped, so like this:

id | label
------ | ------ 
1 | 0
2 | 0
3 | 0
1 | 0
2 | 0
3 | 0
3 | 0
1 | 1
2 | 1

Thanks in advance!

First sort by id and label, then use cumcount to create an index representing 1,2,3 groups, then sort on index and by labels.

df_out = df.sort_values(by=['id','label'])\
  .set_index(df.groupby('id').cumcount())\
  .sort_index()\
  .sort_values(by='label')

Output:

   id  label
0   1      0
0   2      0
0   3      0
1   1      0
1   2      0
1   3      0
2   3      0
2   1      1
2   2      1

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM