
Pandas Groupby get max of multiple columns but in order

I have the following pandas DataFrame:

COlA    ColB    Result  Freq
A       B       1       3000
A       C       0.2     4000
A       D       1       5000
A       E       0.3     9000
A       F       0.4     8000
B       A       0.4     1000
B       C       0.1     4000
B       D       0.1     5000
B       E       0.2     9000
B       F       0.3     8000
...

I want to group by ColA and get the max of Result and Freq in order: first find the rows with the max Result and, if more than one row shares that max, pick the one with the highest Freq. I've tried groupby().max().reset_index(), but it does not give the desired output.

Expected output:

COlA    ColB    Result  Freq
A       D       1       5000
B       A       0.4     1000
...

You can sort by Result/Freq and then use groupby + first:

(df.sort_values(by=['Result', 'Freq'], ascending=False)
   .groupby(['COlA'], as_index=False).first()
)

Output:

  COlA ColB  Result  Freq
0    A    D     1.0  5000
1    B    A     0.4  1000

NB: note that your column name is COlA (with a capital O).
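To make the difference concrete, here is a minimal self-contained sketch. It assumes the DataFrame consists only of the rows visible in the question (the "..." rows are left out), and the names per_column and best_rows are illustrative. It contrasts the per-column max from the question with the sort + first approach:

```python
import pandas as pd

# Sample rows copied from the question (the "..." rows are omitted).
df = pd.DataFrame({
    "COlA":   ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "ColB":   ["B", "C", "D", "E", "F", "A", "C", "D", "E", "F"],
    "Result": [1, 0.2, 1, 0.3, 0.4, 0.4, 0.1, 0.1, 0.2, 0.3],
    "Freq":   [3000, 4000, 5000, 9000, 8000, 1000, 4000, 5000, 9000, 8000],
})

# Per-column max: every column is maximised independently, so the values
# in one output row can come from different input rows.
per_column = df.groupby("COlA", as_index=False).max()

# Row-wise pick: order by Result, then Freq (both descending),
# and keep the top row of each group.
best_rows = (
    df.sort_values(by=["Result", "Freq"], ascending=False)
      .groupby("COlA", as_index=False)
      .first()
)

print(per_column)
print(best_rows)
```

One caveat worth keeping in mind: GroupBy.first() takes the first non-null value of each column, so if the data contained missing values, .head(1) on the sorted groups might be the safer choice for picking whole rows.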

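import pandas as pd

def function1(dd: pd.DataFrame) -> pd.DataFrame:
    # Sort the group by Result, then Freq (both descending) and keep the top row
    return dd.sort_values(by=['Result', 'Freq'], ascending=[False, False]).head(1)

df1.groupby('COlA').apply(function1).reset_index(drop=True)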


Output:

  COlA ColB  Result  Freq
0    A    D     1.0  5000
1    B    A     0.4  1000
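As a possible alternative that avoids groupby().apply() altogether, the same ordering idea can be sketched with sort_values followed by drop_duplicates. This is not taken from either answer; it assumes df1 holds a subset of the rows from the question, and the name top is illustrative:

```python
import pandas as pd

# A subset of the rows from the question, enough to show the tie on Result.
df1 = pd.DataFrame({
    "COlA":   ["A", "A", "A", "B", "B"],
    "ColB":   ["B", "D", "F", "A", "F"],
    "Result": [1.0, 1.0, 0.4, 0.4, 0.3],
    "Freq":   [3000, 5000, 8000, 1000, 8000],
})

top = (
    df1.sort_values(by=["Result", "Freq"], ascending=[False, False])
       # After sorting, the first row seen for each COlA value is the best one.
       .drop_duplicates(subset="COlA", keep="first")
       # Restore a stable order by group key and a clean index.
       .sort_values("COlA")
       .reset_index(drop=True)
)

print(top)
```

Because it works on whole rows, this variant keeps missing values exactly where they are in the selected row.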

