Sorting values by two columns in Pandas (Python)

The idea is to sort values by two columns. Given two columns, I expect output like the following.

Expected output:

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
3   NaN   5.0
4  10.0   NaN
5  24.0  24.7
6  31.0  31.4

However, using the code below:

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'x': [2, 3, 4, 24, 31, '', 10],
                    'y': ['', '', 4.1, 24.7, 31.4, 5, '']})
# Treat empty or whitespace-only strings as missing values.
df1.replace(r'^\s*$', np.nan, regex=True, inplace=True)
# Sort by x first, then by y, both ascending.
rslt_df = df1.sort_values(by=['x', 'y'], ascending=(True, True))

print(rslt_df)

produces the following:

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
6  10.0   NaN
3  24.0  24.7
4  31.0  31.4
5   NaN   5.0

Notice that the row with 5.0 in column y ends up in the last position, at the bottom.

What modification to the code is needed to obtain the intended output?
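
For context, the row with a missing x lands at the bottom because sort_values places NaN last by default (na_position='last'). A minimal sketch, assuming the blanks are written directly as NaN instead of being produced by replace, suggests that switching to na_position='first' only moves that row to the top, so the option alone does not give the expected order:

```python
import numpy as np
import pandas as pd

# Same data as in the question, with the blanks written directly as NaN
# (an assumption made here to keep the example self-contained).
df = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Default behaviour: rows whose 'x' is NaN are placed after all others.
last = df.sort_values(by=['x', 'y'])

# na_position='first' moves those rows to the top instead.
first = df.sort_values(by=['x', 'y'], na_position='first')

print(last.index.tolist())
print(first.index.tolist())
```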

Try sorting by x with its missing values filled from y (fillna), then reindex df1 using the index of those sorted values:

df1.reindex(df1['x'].fillna(df1['y']).sort_values().index).reset_index(drop=True)

To update the df1 variable:

df1 = (
    df1.reindex(df1['x'].fillna(df1['y']).sort_values().index)
        .reset_index(drop=True)
)

df1:

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
3   NaN   5.0
4  10.0   NaN
5  24.0  24.7
6  31.0  31.4
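
The one-liner above can be read as three steps. The sketch below splits them apart; the names key, order and result are introduced here for illustration only, and the frame is built with NaN directly instead of going through replace:

```python
import numpy as np
import pandas as pd

# Question data, with blanks written directly as NaN (assumption).
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Step 1: build a single sort key: x where present, otherwise y.
key = df1['x'].fillna(df1['y'])

# Step 2: sort the key; its index now lists the row labels in the wanted order.
order = key.sort_values().index

# Step 3: reorder the frame by those labels and renumber the rows from 0.
result = df1.reindex(order).reset_index(drop=True)

print(result)
```

Note that the key uses y only when x is missing, so a row that has both values is always positioned by its x.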

With np.sort and argsort:

df1.iloc[np.sort(df1[['x','y']],axis=1)[:,0].argsort()]

      x     y
0   2.0   NaN
1   3.0   NaN
2   4.0   4.1
5   NaN   5.0
6  10.0   NaN
3  24.0  24.7
4  31.0  31.4
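
A step-by-step sketch of the np.sort and argsort approach, under the same assumption that the blanks are already NaN. The names row_min, positions and same_key are hypothetical helpers added for illustration:

```python
import numpy as np
import pandas as pd

# Question data, with blanks written directly as NaN (assumption).
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# np.sort along axis=1 sorts each row and pushes NaN to the end of the row,
# so column 0 holds the smallest non-missing value of each row.
row_min = np.sort(df1[['x', 'y']].to_numpy(), axis=1)[:, 0]

# argsort returns the row positions that would sort those keys.
positions = row_min.argsort(kind='stable')
result = df1.iloc[positions]

# For comparison: a row-wise min that skips NaN gives the same key here.
same_key = df1[['x', 'y']].min(axis=1).to_numpy()

print(result)
```

This variant orders each row by the smaller of x and y, whereas the fillna approach uses y only when x is missing. The two agree on this data because y is never smaller than x; they could differ on data where it is. The original index is kept here; append .reset_index(drop=True) to renumber the rows.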
