Sorting a dataframe based on multiple columns and conditions
I am trying to sort the following dataframe by rolls descending first, then by diff_vto ascending for positive values, and finally by diff_vto ascending for negative values.

This is the original dataframe:
    day  prob  vto  rolls  diff  diff_vto
0     1    10   14   27.0   0.0       -13
1     2    10   14   20.0   3.0       -12
2     3     7   14   16.0   4.0       -11
3     4     3   14   12.0  -3.0       -10
4     5     6   14   17.0   3.0        -9
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5
9    10     3   14   17.0   0.0        -4
10   11     3   14   22.0  -8.0        -3
11   12    11   14   27.0   3.0        -2
12   13     8   14   23.0   0.0        -1
13   14     8   14   25.0   1.0         0
14   15     7   14   27.0  -3.0         1

This is the code in case you wish to replicate it:
This is the code in case you wish to replicate it: 这是您希望复制它的代码:
import pandas as pd

a = {'day': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],
     'prob': [10,10,7,3,6,3,8,3,3,3,3,11,8,8,7],
     'vto': [14,14,14,14,14,14,14,14,14,14,14,14,14,14,14]}
df = pd.DataFrame(a)
df.loc[len(df)+1] = df.loc[0]  # Append copies of the first 2 days so the rolling window can wrap around
df.loc[len(df)+2] = df.loc[1]
df['rolls'] = df['prob'].rolling(3).sum()
df['rolls'] = df['rolls'].shift(periods=-2)  # Shift rolls so row i holds prob[i] + prob[i+1] + prob[i+2]
df['diff'] = df['prob'].diff(periods=-1)  # prob[i] - prob[i+1]
df['diff_vto'] = df['day'] - df['vto']
df = df.head(15)  # Drop the 2 helper rows again
print(df)
I want to sort the dataframe by rolls (descending), followed by the minimum value of diff_vto when it is positive (ascending), followed by the minimum value of diff_vto when it is negative (ascending).

Based on the dataframe posted above, this would be the expected output:
    day  prob  vto  rolls  diff  diff_vto
14   15     7   14   27.0  -3.0         1
0     1    10   14   27.0   0.0       -13
11   12    11   14   27.0   3.0        -2
13   14     8   14   25.0   1.0         0
12   13     8   14   23.0   0.0        -1
10   11     3   14   22.0  -8.0        -3
1     2    10   14   20.0   3.0       -12
4     5     6   14   17.0   3.0        -9
9    10     3   14   17.0   0.0        -4
2     3     7   14   16.0   4.0       -11
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
3     4     3   14   12.0  -3.0       -10
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5
I have tried applying .sort_values(), but I can't get the conditional sorting to work on diff_vto, because setting it to ascending places the negative values before the positive ones.

Could I please get a suggestion? Thanks.
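A minimal sketch of the problem described above, using a small made-up frame (the name demo and its values are illustrative assumptions, not the data from the question). With a plain ascending sort on diff_vto inside each rolls group, the negative values come first:

```python
import pandas as pd

# Small illustrative frame (made-up values): the rolls == 27.0 group
# contains both negative and positive diff_vto values.
demo = pd.DataFrame({
    "rolls": [27.0, 27.0, 27.0, 20.0],
    "diff_vto": [-13, 1, -2, -12],
})

# rolls descending, diff_vto ascending: within the 27.0 group the
# negative values are placed before the positive one.
naive = demo.sort_values(["rolls", "diff_vto"], ascending=[False, True])
print(naive)
```

Here the positive value 1 ends up last in its group, whereas the desired ordering would place it first.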
After rolls, you want to sort by diff_vto > 0 and abs(diff_vto), both descending:
df['pos'] = df['diff_vto'].gt(0)   # True where diff_vto is positive
df['abs'] = df['diff_vto'].abs()   # magnitude of diff_vto
df.sort_values(['rolls', 'pos', 'abs'], ascending=[False, False, False])
Output (you can drop pos and abs if needed):
    day  prob  vto  rolls  diff  diff_vto    pos  abs
14   15     7   14   27.0  -3.0         1   True    1
0     1    10   14   27.0   0.0       -13  False   13
11   12    11   14   27.0   3.0        -2  False    2
13   14     8   14   25.0   1.0         0  False    0
12   13     8   14   23.0   0.0        -1  False    1
10   11     3   14   22.0  -8.0        -3  False    3
1     2    10   14   20.0   3.0       -12  False   12
4     5     6   14   17.0   3.0        -9  False    9
9    10     3   14   17.0   0.0        -4  False    4
2     3     7   14   16.0   4.0       -11  False   11
5     6     3   14   14.0  -5.0        -8  False    8
6     7     8   14   14.0   5.0        -7  False    7
3     4     3   14   12.0  -3.0       -10  False   10
7     8     3   14    9.0   0.0        -6  False    6
8     9     3   14    9.0   0.0        -5  False    5
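If the helper columns should not remain in the result, or if several positive diff_vto values could share the same rolls value, a variant along the following lines may be closer to the ordering described in the question. This is only a sketch under stated assumptions: the frame is rebuilt with just the columns needed for sorting (rolls is typed in by hand from the question's table), the temporary column is called pos as in the snippet above, and diff_vto itself is sorted ascending within each pos group instead of sorting its absolute value descending. For the sample data both approaches give the same row order, because only one diff_vto value is positive.

```python
import pandas as pd

# Only the columns needed for sorting, copied from the question's table.
df = pd.DataFrame({
    "day": list(range(1, 16)),
    "rolls": [27.0, 20.0, 16.0, 12.0, 17.0, 14.0, 14.0, 9.0,
              9.0, 17.0, 22.0, 27.0, 23.0, 25.0, 27.0],
})
df["diff_vto"] = df["day"] - 14  # vto is 14 for every row in the question

result = (
    df.assign(pos=df["diff_vto"].gt(0))      # temporary flag: True for positive values
      .sort_values(["rolls", "pos", "diff_vto"],
                   ascending=[False, False, True])
      .drop(columns="pos")                   # remove the helper column again
)
print(result)
```

Using assign keeps the original df unchanged, so no cleanup of helper columns is needed afterwards.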