Python: Remove rows with the maximum value in each group

Question

I have a pandas data frame df like this:

In [1]: df
Out[1]:
      country     count
0       Japan        78
1       Japan        80
2         USA        45
3      France        34
4      France        90
5          UK        45
6          UK        34
7       China        32
8       China        87
9      Russia        20
10      Russia        67

I want to remove the rows with the maximum value in each group, so the result should look like this:

      country     count
0       Japan        78
3      France        34
6          UK        34
7       China        32
9      Russia        20

My first attempt:

idx = df.groupby(['country'], sort=False).max()['count'].index
df_new = df.drop(list(idx))

My second attempt:

idx = df.groupby(['country'])['count'].transform(max).index
df_new = df.drop(list(idx))

But neither attempt worked. Any ideas?

Answer 1

groupby / transform('max')

You can first calculate a series of maximums by group, then filter out the rows where count is equal to that series. Note that this also removes duplicate maximums: if several rows in a group tie for the maximum, all of them are dropped.

g = df.groupby(['country'])['count'].transform('max')
df = df[~(df['count'] == g)]

The series g holds, for each row, the maximum of that row's group. Where it equals df['count'] (aligned by index), the row holds the maximum for its group. You then use ~ to negate the condition, which keeps only the non-maximum rows.

print(df.groupby(['country'])['count'].transform('max'))

0     80
1     80
2     45
3     90
4     90
5     45
6     45
7     87
8     87
9     67
10    67
Name: count, dtype: int64
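
For reference, here is a minimal self-contained sketch of this approach on the data from the question. The variable name result is only illustrative, and != is used as an equivalent spelling of ~(... == ...). The assertion in the middle shows what appears to be the problem with the question's second attempt: the index of the transformed series is every row label, so passing it to drop removes the whole frame. The first attempt has the opposite problem, since its index holds country names rather than row labels.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    'country': ['Japan', 'Japan', 'USA', 'France', 'France', 'UK', 'UK',
                'China', 'China', 'Russia', 'Russia'],
    'count': [78, 80, 45, 34, 90, 45, 34, 32, 87, 20, 67],
})

# Per-row group maximum, aligned with df's own index.
g = df.groupby('country')['count'].transform('max')

# g.index contains every row label of df, which is why dropping
# list(g.index) (the second attempt) leaves nothing behind.
assert list(g.index) == list(df.index)

# Keep only the rows whose count differs from the group maximum.
result = df[df['count'] != g]
print(result)
```

Rows 0, 3, 6, 7 and 9 remain, matching the result requested in the question. USA disappears entirely because its only row is also its maximum.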

sort + drop

Alternatively, you can sort by count and drop the last row in each group:

res = df.sort_values('count')
res = res.drop(res.groupby('country').tail(1).index)

print(res)

  country  count
9  Russia     20
7   China     32
3  France     34
6      UK     34
0   Japan     78
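
The two approaches differ when a group's maximum occurs more than once. The sketch below uses a small made-up frame (the name tied and its values are hypothetical, not from the question) to show the difference. It also passes kind='mergesort' to sort_values, which is an addition here so that tied rows keep their original order.

```python
import pandas as pd

# Hypothetical data: group 'A' reaches its maximum (5) twice.
tied = pd.DataFrame({
    'country': ['A', 'A', 'A', 'B', 'B'],
    'count': [5, 5, 1, 3, 2],
})

# transform('max'): every row equal to the group maximum is removed.
g = tied.groupby('country')['count'].transform('max')
by_mask = tied[tied['count'] != g]

# sort + tail(1): exactly one row per group is removed.
# mergesort is stable, so tied rows stay in their original order.
s = tied.sort_values('count', kind='mergesort')
by_tail = s.drop(s.groupby('country').tail(1).index)

print(sorted(by_mask.index.tolist()))  # [2, 4]
print(sorted(by_tail.index.tolist()))  # [0, 2, 4]
```

Which behavior is preferable depends on whether tied maximums should all be removed or only one of them.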

Answer 1 (6, accepted, 2018-07-11 15:47:58)
