Python pandas: sort inter groups, not intra groups (rearrange grouped rows but maintain original row order before groupby)

I want to sort groups of rows based on a column (in my example, 'Group' is the column to group by) and then sort the groups while maintaining the in-group row order. I can't sort by index because the index is purposefully out of order as a result of previous operations.

import pandas as pd

df = pd.DataFrame({
    'Group': [5, 5, 5, 9, 9, 777, 777, 1, 2, 2],
    'V1': ['a', 'b', 'a', 3, 6, 1, None, 10, 3, None],
    'V2': ['blah', 'blah', 'blah', 'dog', 'cat', 'cat', 'na', 'first', 'last', 'nada'],
    'V3': [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
})

Original df

And want it to look like this:

Expected result


I've tried various things like

df.groupby(['Group'])['Group'].aggregate({'min grp':'min'}).sort_values(by=['min grp'], ascending=True)

If it helps, the original df was created via pd.concat(list-of-dataframes) and when I sorted them afterwards by Group it also sorted the rows within the Group based on the index, which does not work for my specific problem.

You need to use sort_values with the option kind='mergesort'. From the pandas docs:

kind : {‘quicksort’, ‘mergesort’, ‘heapsort’}, default ‘quicksort’
      Choice of sorting algorithm. See also ndarray.np.sort for more
      information. mergesort is the only stable algorithm. For DataFrames,
      this option is only applied when sorting on a single column or label.

A sorting algorithm is called stable when elements with equal keys appear in the output in the same order as they appeared in the input. Stable sorts include insertion sort, merge sort, bubble sort, Timsort, and counting sort.
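
To make the definition concrete, here is a minimal sketch using Python's built-in sorted(), which is guaranteed to be stable. The rows list is made-up illustration data, not taken from the question; each tuple is (key, label), and only the key is used for sorting.

```python
# Hypothetical data: (key, label) pairs with repeated keys.
rows = [(5, 'a'), (9, 'x'), (5, 'b'), (1, 'first'), (9, 'y'), (5, 'c')]

# Sort on the key only. Because sorted() is stable, tuples that share
# a key keep the relative order they had in the input list.
stable = sorted(rows, key=lambda r: r[0])

print(stable)
# [(1, 'first'), (5, 'a'), (5, 'b'), (5, 'c'), (9, 'x'), (9, 'y')]
```

The labels within key 5 stay in the order a, b, c, and within key 9 in the order x, y, which is exactly the in-group order the question asks to preserve.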

So you need:

df = df.sort_values('Group', kind='mergesort')

When you call sort_values without kind, it defaults to 'quicksort', and quicksort is not stable.
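
As a rough check of the stable sort on data shaped like the question's, the sketch below uses a trimmed version of the frame with a deliberately shuffled index. The index values are an assumption chosen for illustration; the question does not show the real ones.

```python
import pandas as pd

# Trimmed version of the question's frame. The out-of-order index is
# hypothetical, standing in for the index left over from pd.concat.
df = pd.DataFrame(
    {
        'Group': [5, 5, 5, 9, 9, 777, 777, 1, 2, 2],
        'V3': [1, 2, 3, 4, 5, 5, 4, 3, 2, 1],
    },
    index=[7, 3, 9, 0, 4, 8, 1, 6, 5, 2],
)

# Stable sort on 'Group': groups are reordered by key, while rows inside
# each group keep their original relative order (not index order).
result = df.sort_values('Group', kind='mergesort')

print(result)
```

In the result, the rows of group 5 still appear with V3 as 1, 2, 3 and index 7, 3, 9, so neither the index nor any other column influenced the in-group order.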

If I understand your question correctly, you don't want to group by, but to sort by the values of your Group column. You can do that with DataFrame.sort_values():

df.sort_values('Group', inplace=True)

You can also do it this way.

df.sort_values(by=["Group"])
