Pandas Groupby: select the row with the highest value in one column if multiple rows exceed a value in another column

Question

This operation groups my DataFrame by two columns, then returns the row with the highest value in ColumnC:

df2 = df.loc[df.groupby(['columnA', 'columnB'], sort=False)['columnC'].idxmax()]

Instead, for all rows where ColumnC > 100 within each group, I would like to take the row with the highest value in ColumnD.

How can I do this?

Edit:

The comment below by @Code Different is basically what I'm looking for, but I don't want to exclude groups where none of the rows have ColumnC > 100. In those cases I want the row with the highest value in ColumnC, as in the example above.

Answer 1

Usually we split the data into two parts, filter each part according to the condition, then combine them.

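import pandas as pd

# Sort by columnD so the largest value comes last within each group
df = df.sort_values('columnD')

# Rows with columnC > 100: keep the highest-columnD row per group
df1 = df[df['columnC'] > 100].drop_duplicates(['columnA', 'columnB'], keep='last')
# Fallback for groups with no such row: keep the highest-columnC row per group
df2 = df.sort_values('columnC').drop_duplicates(['columnA', 'columnB'], keep='last')

# df1 rows come first, so they win; df2 only fills in groups missing from df1
Yourdf = pd.concat([df1, df2]).drop_duplicates(['columnA', 'columnB'])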
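
The same split-then-combine idea can also be written with groupby and idxmax, in the style of the snippet in the question. This is only a minimal sketch: the sample data is made up, and the helper name pick_rows and its threshold parameter are hypothetical, not part of pandas.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration only
df = pd.DataFrame({
    'columnA': ['x', 'x', 'x', 'y', 'y'],
    'columnB': [1, 1, 1, 2, 2],
    'columnC': [150, 120, 90, 80, 60],
    'columnD': [5, 9, 20, 7, 3],
})

def pick_rows(frame, threshold=100):
    """Per group: highest columnD among rows with columnC > threshold,
    falling back to the highest columnC when no row passes the threshold."""
    keys = ['columnA', 'columnB']

    # Part 1: only rows above the threshold, best columnD per group
    above = frame[frame['columnC'] > threshold]
    best_d = above.loc[above.groupby(keys, sort=False)['columnD'].idxmax()]

    # Part 2: fallback candidate for every group, best columnC per group
    best_c = frame.loc[frame.groupby(keys, sort=False)['columnC'].idxmax()]

    # best_d is listed first, so drop_duplicates keeps it when a group has both
    return pd.concat([best_d, best_c]).drop_duplicates(keys)

result = pick_rows(df)
print(result)
```

In this sample, group ('x', 1) has two rows with columnC > 100, so the one with the larger columnD is kept, while group ('y', 2) has none and falls back to its highest columnC row. The row with columnD = 20 is not chosen because its columnC is below the threshold.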

Solution 1
0 2019-09-07 00:23:33