pandas DataFrame.groupby and applying a custom function

Question

I have a DataFrame with many duplicates (I need each Type/StrikePrice pair to be unique), like this:

                   Pos  AskPrice
Type  StrikePrice
C     1500.0       10    281.6
C     1500.0       11    281.9
C     1500.0       12    281.7     <- I need this one
P     1400.0       30    1200.5
P     1400.0       31    1250.2    <- I need this one

How can I group by Type and StrikePrice and apply some logic (my own function) to decide which row of each group to keep, for example the row with the largest Pos?

The expected result is:

                   Pos  AskPrice
Type  StrikePrice
C     1500.0       12    281.7
P     1400.0       31    1250.2

Thanks a lot!

Answer 1

First use reset_index so that every row gets a unique index label, then use groupby with idxmax to get the index label of the maximum Pos in each group and select those rows with loc. Finally, use set_index to restore the MultiIndex:

df = df.reset_index()
# parentheses are needed so the chained call can continue on the next line
df = (df.loc[df.groupby(['Type','StrikePrice'])['Pos'].idxmax()]
        .set_index(['Type','StrikePrice']))
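
For anyone who wants to try the idxmax approach end to end, here is a minimal sketch that rebuilds the sample frame from the question. The variable names flat, best and result are only illustrative and are not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question, indexed by (Type, StrikePrice)
df = pd.DataFrame(
    {
        "Type": ["C", "C", "C", "P", "P"],
        "StrikePrice": [1500.0, 1500.0, 1500.0, 1400.0, 1400.0],
        "Pos": [10, 11, 12, 30, 31],
        "AskPrice": [281.6, 281.9, 281.7, 1200.5, 1250.2],
    }
).set_index(["Type", "StrikePrice"])

# Move the index levels into columns so each row has a unique integer label
flat = df.reset_index()

# For each group, the row label at which Pos is largest
best = flat.groupby(["Type", "StrikePrice"])["Pos"].idxmax()

# Select those rows and restore the MultiIndex
result = flat.loc[best].set_index(["Type", "StrikePrice"])
print(result)
```

The reset_index step matters because idxmax returns index labels, and the original MultiIndex labels are duplicated, so they cannot identify a single row.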

Or use sort_values with drop_duplicates:

df = (df.reset_index()
       .sort_values(['Type','StrikePrice', 'Pos'])
       .drop_duplicates(['Type','StrikePrice'], keep='last')
       .set_index(['Type','StrikePrice']))
print(df)

                  Pos  AskPrice
Type StrikePrice               
C    1500.0        12     281.7
P    1400.0        31    1250.2
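
A possible variant of the same sort-then-keep-last idea, not taken from the original answer, skips the reset_index/set_index round trip by working on the index directly with Index.duplicated. This is a sketch that assumes the frame still has its original (Type, StrikePrice) MultiIndex; the names ordered and result are illustrative.

```python
import pandas as pd

# Sample data copied from the question, indexed by (Type, StrikePrice)
df = pd.DataFrame(
    {
        "Type": ["C", "C", "C", "P", "P"],
        "StrikePrice": [1500.0, 1500.0, 1500.0, 1400.0, 1400.0],
        "Pos": [10, 11, 12, 30, 31],
        "AskPrice": [281.6, 281.9, 281.7, 1200.5, 1250.2],
    }
).set_index(["Type", "StrikePrice"])

# Sort ascending by Pos so the largest Pos of each pair comes last
ordered = df.sort_values("Pos")

# Keep only the last occurrence of every (Type, StrikePrice) index entry
result = ordered[~ordered.index.duplicated(keep="last")].sort_index()
print(result)
```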

But if you need a custom function, use GroupBy.apply:

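def f(x):
    # x holds the rows of one (Type, StrikePrice) group;
    # keep the row(s) whose Pos equals the group maximum (ties are all kept)
    return x[x['Pos'] == x['Pos'].max()]

# df must still have its original (Type, StrikePrice) MultiIndex here
df = df.groupby(level=[0,1], group_keys=False).apply(f)
print(df)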
                  Pos  AskPrice
Type StrikePrice               
C    1500.0        12     281.7
P    1400.0        31    1250.2
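
If the custom logic has to return exactly one row per group even when Pos is tied, the function can rank the rows itself. The sketch below is only an illustration: the extra tied row (Pos 12, AskPrice 281.5), the tie-break rule (lowest AskPrice wins) and the name pick_row are assumptions, not part of the question or the original answer.

```python
import pandas as pd

# Question data plus one hypothetical tied row (Pos 12, AskPrice 281.5)
df = pd.DataFrame(
    {
        "Type": ["C", "C", "C", "C", "P", "P"],
        "StrikePrice": [1500.0, 1500.0, 1500.0, 1500.0, 1400.0, 1400.0],
        "Pos": [10, 11, 12, 12, 30, 31],
        "AskPrice": [281.6, 281.9, 281.7, 281.5, 1200.5, 1250.2],
    }
).set_index(["Type", "StrikePrice"])


def pick_row(group):
    # Largest Pos first; among equal Pos values, lowest AskPrice first
    ordered = group.sort_values(["Pos", "AskPrice"], ascending=[False, True])
    # head(1) returns a one-row DataFrame, so the group's index is preserved
    return ordered.head(1)


result = df.groupby(level=["Type", "StrikePrice"], group_keys=False).apply(pick_row)
print(result)
```

Returning a one-row DataFrame rather than a Series keeps the (Type, StrikePrice) index on the selected row, and group_keys=False stops pandas from adding the group keys as extra index levels.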

Answer 1
2 accepted 2018-01-27 18:00:16