Sorting dataframe based on multiple columns and conditions

I am trying to sort the following dataframe by rolls descending first, then by diff_vto ascending for positive values, and finally by diff_vto ascending for negative values. This is the original dataframe:

    day  prob  vto  rolls  diff  diff_vto
0     1    10   14   27.0   0.0       -13
1     2    10   14   20.0   3.0       -12
2     3     7   14   16.0   4.0       -11
3     4     3   14   12.0  -3.0       -10
4     5     6   14   17.0   3.0        -9
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5
9    10     3   14   17.0   0.0        -4
10   11     3   14   22.0  -8.0        -3
11   12    11   14   27.0   3.0        -2
12   13     8   14   23.0   0.0        -1
13   14     8   14   25.0   1.0         0
14   15     7   14   27.0  -3.0         1

This is the code in case you wish to replicate it:

    import pandas as pd
    a = {'day': [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],
         'prob': [10,10,7,3,6,3,8,3,3,3,3,11,8,8,7],
         'vto': [14,14,14,14,14,14,14,14,14,14,14,14,14,14,14]}
    df = pd.DataFrame(a)
    df.loc[len(df)+1] = df.loc[0]  # Add 2 extra days so the last rolling windows are complete
    df.loc[len(df)+2] = df.loc[1]
    df['rolls'] = df['prob'].rolling(3).sum()
    df['rolls'] = df['rolls'].shift(periods=-2)  # Shift so rolls[i] covers rows i to i+2
    df['diff'] = df['prob'].diff(periods=-1)  # prob[i] - prob[i+1]
    df['diff_vto'] = df['day'] - df['vto']
    df = df.head(15)  # Drop the 2 helper rows again
    print(df)

I want to be able to sort the dataframe based on rolls (descending), followed by diff_vto where it is positive (ascending), followed by diff_vto where it is negative (ascending). Based on the dataframe posted above, this would be the expected output:

    day  prob  vto  rolls  diff  diff_vto
14   15     7   14   27.0  -3.0         1
0     1    10   14   27.0   0.0       -13
11   12    11   14   27.0   3.0        -2
13   14     8   14   25.0   1.0         0
12   13     8   14   23.0   0.0        -1
10   11     3   14   22.0  -8.0        -3
1     2    10   14   20.0   3.0       -12
4     5     6   14   17.0   3.0        -9
9    10     3   14   17.0   0.0        -4
2     3     7   14   16.0   4.0       -11
5     6     3   14   14.0  -5.0        -8
6     7     8   14   14.0   5.0        -7
3     4     3   14   12.0  -3.0       -10
7     8     3   14    9.0   0.0        -6
8     9     3   14    9.0   0.0        -5

I have tried applying .sort_values(), but I can't get the conditional sorting to work on diff_vto, because setting it to ascending places the negative values before the positive ones. Could I please get a suggestion? Thanks.
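To make the problem concrete, here is a minimal sketch on a made-up three-row frame (the name demo and its values are illustrative, taken from the rolls = 27 group above). A plain two-column sort puts the negative diff_vto values ahead of the positive one:

```python
import pandas as pd

# Hypothetical frame: three rows sharing the same rolls value
demo = pd.DataFrame({"rolls": [27.0, 27.0, 27.0], "diff_vto": [1, -13, -2]})

# Plain sort: rolls descending, diff_vto ascending
naive = demo.sort_values(["rolls", "diff_vto"], ascending=[False, True])
print(naive["diff_vto"].tolist())  # [-13, -2, 1], but the wanted order is 1, -13, -2
```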

After rolls, you want to sort by diff_vto > 0 and abs(diff_vto), both descending:

df['pos'] = df['diff_vto'].gt(0)
df['abs'] = df['diff_vto'].abs()

df.sort_values(['rolls', 'pos', 'abs'], ascending=[False, False, False])

Output (you can drop pos and abs if needed):

    day  prob  vto  rolls  diff  diff_vto    pos  abs
14   15     7   14   27.0  -3.0         1   True    1
0     1    10   14   27.0   0.0       -13  False   13
11   12    11   14   27.0   3.0        -2  False    2
13   14     8   14   25.0   1.0         0  False    0
12   13     8   14   23.0   0.0        -1  False    1
10   11     3   14   22.0  -8.0        -3  False    3
1     2    10   14   20.0   3.0       -12  False   12
4     5     6   14   17.0   3.0        -9  False    9
9    10     3   14   17.0   0.0        -4  False    4
2     3     7   14   16.0   4.0       -11  False   11
5     6     3   14   14.0  -5.0        -8  False    8
6     7     8   14   14.0   5.0        -7  False    7
3     4     3   14   12.0  -3.0       -10  False   10
7     8     3   14    9.0   0.0        -6  False    6
8     9     3   14    9.0   0.0        -5  False    5
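One caveat worth noting: sorting abs in descending order also puts positive diff_vto values in descending order. That goes unnoticed here because the data contains only one positive value. If positives should be ascending, as the question asks, one possible sketch is to keep the pos flag and sort diff_vto itself in ascending order within each sign group. The frame below and the helper column name _pos are made up for illustration, and zero is grouped with the negatives, just as .gt(0) does above:

```python
import pandas as pd

# Hypothetical data: one rolls group with two positive and two negative diff_vto values
df = pd.DataFrame({
    "rolls": [27.0, 27.0, 27.0, 27.0, 25.0],
    "diff_vto": [3, -13, 1, -2, 0],
})

result = (
    df.assign(_pos=df["diff_vto"].gt(0))   # temporary flag: True for positive values
      .sort_values(["rolls", "_pos", "diff_vto"],
                   ascending=[False, False, True])
      .drop(columns="_pos")                # remove the helper column afterwards
)
print(result["diff_vto"].tolist())  # [1, 3, -13, -2, 0]
```

Using assign and drop in one chain also avoids leaving the helper column on the original dataframe.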
