Sorting values by two columns in pandas (Python)
I want to sort values by two columns. Given two columns, x and y, I expect output like the following.
Expected output:
x y
0 2.0 NaN
1 3.0 NaN
2 4.0 4.1
3 NaN 5.0
4 10.0 NaN
5 24.0 24.7
6 31.0 31.4
However, the code below
import pandas as pd
import numpy as np
df1 = pd.DataFrame({'x': [2, 3, 4, 24, 31, '', 10],
                    'y': ['', '', 4.1, 24.7, 31.4, 5, '']})
df1.replace(r'^\s*$', np.nan, regex=True, inplace=True)
rslt_df = df1.sort_values(by=['x', 'y'], ascending=(True, True))
print(rslt_df)
produces the following:
x y
0 2.0 NaN
1 3.0 NaN
2 4.0 4.1
6 10.0 NaN
3 24.0 24.7
4 31.0 31.4
5 NaN 5.0
Notice that the row with 5.0 in column y is placed at the bottom.
How should I modify the code to obtain the intended output?
Try sorting by x with its missing values filled from y (fillna), then reindex from those sorted values:
df1.reindex(df1['x'].fillna(df1['y']).sort_values().index).reset_index(drop=True)
To update the df1 variable:
df1 = (
df1.reindex(df1['x'].fillna(df1['y']).sort_values().index)
.reset_index(drop=True)
)
df1:
x y
0 2.0 NaN
1 3.0 NaN
2 4.0 4.1
3 NaN 5.0
4 10.0 NaN
5 24.0 24.7
6 31.0 31.4
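The same steps can be split out so each intermediate result can be inspected. This is only a sketch: the frame is built with np.nan directly instead of going through replace, and the names key, order, and result are illustrative, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as the question, with missing values written as NaN up front.
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Sort key: x where it is present, otherwise the value from y.
key = df1['x'].fillna(df1['y'])

# Index labels of the rows, in ascending order of the key.
order = key.sort_values().index

# Reorder the rows by those labels, then renumber the index from 0.
result = df1.reindex(order).reset_index(drop=True)
print(result)
```

Sorting a single filled key avoids the problem in the question, where sort_values on x places rows with a missing x at the end by default.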
With np.sort and argsort:
df1.iloc[np.sort(df1[['x','y']],axis=1)[:,0].argsort()]
x y
0 2.0 NaN
1 3.0 NaN
2 4.0 4.1
5 NaN 5.0
6 10.0 NaN
3 24.0 24.7
4 31.0 31.4
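np.sort moves NaN to the end of each row, so column 0 of the sorted array is the smallest non-missing value in that row. A rough pandas equivalent is sketched below, assuming the row-wise minimum is the intended sort key; row_min and result are illustrative names. DataFrame.min skips NaN by default, and kind='stable' keeps tied rows in their original order.

```python
import numpy as np
import pandas as pd

# Same data as the question, with missing values written as NaN up front.
df1 = pd.DataFrame({
    'x': [2, 3, 4, 24, 31, np.nan, 10],
    'y': [np.nan, np.nan, 4.1, 24.7, 31.4, 5, np.nan],
})

# Smallest non-missing value in each row (NaN is skipped by default).
row_min = df1[['x', 'y']].min(axis=1)

# Positions that would sort row_min ascending; stable sort preserves tie order.
positions = row_min.to_numpy().argsort(kind='stable')

# Select rows by position; the original index labels are kept.
result = df1.iloc[positions]
print(result)
```

Unlike the reindex approach, this keeps the original index labels; chain .reset_index(drop=True) if a fresh 0..n-1 index is wanted.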