
Sort certain rows in Data Frames by columns

I have an unsorted dataframe. I want to sort columns A, B, C and D in descending order (largest to smallest), but the rows must stay within their Denomination group. For example, sorting the rows of denomination 100 by columns A, B, C and D changes the row order 0, 1, 2 to 0, 2, 1.

Index   Denomination    A   B   C   D
0        100            5   0   0   0
1        100            0   0   1   0
2        100            0   2   0   0
3        200            5   2   0   0
4        200            5   0   1   0
5        200            0   4   0   0
6        200            10  0   0   0
7        200            0   2   1   0
8        200            0   0   2   0

The sort priority must be A, then B, then C, then D. Relabeling the index is not important. The resulting dataframe should be:

Index   Denomination    A   B   C   D
0        100            5   0   0   0
2        100            0   2   0   0
1        100            0   0   1   0
6        200            10  0   0   0
3        200            5   2   0   0
4        200            5   0   1   0
5        200            0   4   0   0
7        200            0   2   1   0
8        200            0   0   2   0

This can be done in Excel by selecting the rows and applying a custom sort, but I need to do it in Python using dataframes.

This should do it:

df.sort_values(by=['Denomination', 'A', 'B', 'C', 'D'], 
               ascending=[True, False, False, False, False])
Out: 
   Denomination   A  B  C  D
0           100   5  0  0  0
2           100   0  2  0  0
1           100   0  0  1  0
6           200  10  0  0  0
3           200   5  2  0  0
4           200   5  0  1  0
5           200   0  4  0  0
7           200   0  2  1  0
8           200   0  0  2  0

This sorts by Denomination in ascending order; ties are broken by A in descending order, then by B in descending order, and so on.
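For anyone who wants to reproduce this, here is a minimal, self-contained sketch that rebuilds the dataframe from the question and applies the same `sort_values` call. The names `df` and `result` are just illustrative.

```python
import pandas as pd

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "Denomination": [100, 100, 100, 200, 200, 200, 200, 200, 200],
    "A": [5, 0, 0, 5, 5, 0, 10, 0, 0],
    "B": [0, 0, 2, 2, 0, 4, 0, 2, 0],
    "C": [0, 1, 0, 0, 1, 0, 0, 1, 2],
    "D": [0, 0, 0, 0, 0, 0, 0, 0, 0],
})

# Denomination ascending keeps the groups together;
# A, B, C, D descending orders the rows inside each group.
result = df.sort_values(
    by=["Denomination", "A", "B", "C", "D"],
    ascending=[True, False, False, False, False],
)

print(result)
```

`sort_values` returns a new dataframe and leaves `df` unchanged, so assign the result if you want to keep it.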

If the Denomination column shouldn't be sorted and the groups should instead stay in their order of first appearance, you can do something like this:

df.groupby('Denomination')['Denomination'].transform(pd.Series.first_valid_index)
Out: 
0    0
1    0
2    0
3    3
4    3
5    3
6    3
7    3
8    3
Name: Denomination, dtype: int64

This returns a new Series that labels each row with the index label of the first row in its Denomination group. You can add it to the DataFrame as a column and give it the highest sort priority:

(df.assign(denomination_group = 
     df.groupby('Denomination')['Denomination'].transform(pd.Series.first_valid_index))
   .sort_values(by=['denomination_group', 'A', 'B', 'C', 'D'], 
                ascending=[True, False, False, False, False])
   .drop('denomination_group', axis=1))
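As a possible variation (not the approach above, just a sketch of the same idea), `groupby(..., sort=False).ngroup()` also numbers the groups in order of first appearance, and it does not depend on the index values. The data below is hypothetical, chosen so that 200 appears before 100; `df2`, `group_order` and `result2` are illustrative names.

```python
import pandas as pd

# Hypothetical data in which denomination 200 appears before 100.
df2 = pd.DataFrame({
    "Denomination": [200, 100, 200, 100],
    "A": [1, 3, 5, 3],
    "B": [0, 1, 0, 2],
})

# With sort=False, ngroup() numbers groups in order of first appearance:
# 200 -> 0, 100 -> 1.
group_order = df2.groupby("Denomination", sort=False).ngroup()

result2 = (
    df2.assign(denomination_group=group_order)
       .sort_values(by=["denomination_group", "A", "B"],
                    ascending=[True, False, False])
       .drop(columns="denomination_group")
)

print(result2)
```

Here the 200 rows stay ahead of the 100 rows, and each group is sorted by A and then B in descending order.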
