Keep the dataframe rows that satisfy a condition within each group of a grouped dataframe

Question

I have the following dataframe.

    c1  c2  v1  v2
0   a   a   1   2
1   a   a   2   3
2   b   a   3   1
3   b   a   4   5
5   c   d   5   0

I would like to get the following output.

    c1  c2  v1  v2
0   a   a   2   3
1   b   a   4   5
2   c   d   5   0

The rule: first, group the dataframe by c1 and c2. Then, within each group, keep the row with the maximum value in column v2. Finally, output the original dataframe with all the rows that do not satisfy this rule dropped.

What is the best way to obtain this result? Thanks.

While searching, I also found this solution based on the apply method

Answer 1

You could use groupby-transform to generate a boolean selection mask:

grouped = df.groupby(['c1', 'c2'])
mask = grouped['v2'].transform(lambda x: x == x.max()).astype(bool)
df.loc[mask].reset_index(drop=True)

yields

  c1 c2  v1  v2
0  a  a   2   3
1  b  a   4   5
2  c  d   5   0
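For reference, here is a self-contained sketch of the same mask idea. It assumes the sample data from the question and uses the built-in "max" aggregation name instead of a lambda; the names group_max and result are only illustrative.

```python
import pandas as pd

# Sample data from the question (default integer index assumed).
df = pd.DataFrame({
    "c1": ["a", "a", "b", "b", "c"],
    "c2": ["a", "a", "a", "a", "d"],
    "v1": [1, 2, 3, 4, 5],
    "v2": [2, 3, 1, 5, 0],
})

# Broadcast each group's maximum v2 back onto the rows of that group.
group_max = df.groupby(["c1", "c2"])["v2"].transform("max")

# Keep only the rows whose v2 equals the maximum of their group.
result = df.loc[df["v2"] == group_max].reset_index(drop=True)
print(result)
```

Note that if several rows in a group share the maximum v2, a mask like this keeps all of them.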

Answer 2

If you want to make sure that you get exactly one row per group, you can sort the values by "v2" before grouping and then take the last row of each group (the one with the highest v2 value).

import pandas as pd

df = pd.DataFrame({"c1": ["a", "a", "b", "b", "c"], "c2": ["a", "a", "a", "a", "d"], "v1": [1, 2, 3, 4, 5], "v2": [2, 3, 1, 5, 0]})

df.sort_values("v2").groupby(["c1", "c2"]).last().reset_index()

result:

    c1  c2  v1  v2
0   a   a   2   3
1   b   a   4   5
2   c   d   5   0
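As a possible variation (a sketch, not part of the original answer), idxmax can pick one row label per group without sorting, and the rows are then taken from the original dataframe unchanged. This assumes the index labels are unique and v2 has no missing values; best_rows and result are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    "c1": ["a", "a", "b", "b", "c"],
    "c2": ["a", "a", "a", "a", "d"],
    "v1": [1, 2, 3, 4, 5],
    "v2": [2, 3, 1, 5, 0],
})

# For each (c1, c2) group, find the index label of the row with the largest v2.
# On ties, idxmax returns the first matching label, so one row per group is kept.
best_rows = df.groupby(["c1", "c2"])["v2"].idxmax()

# Select those rows from the original dataframe and renumber them.
result = df.loc[best_rows].reset_index(drop=True)
print(result)
```

One reason to consider this: groupby(...).last() takes the last non-null value of each column separately, so with missing values the returned row may combine entries from different original rows.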

2 answers

Answer 1
Score 1, accepted, 2015-01-13 15:33:28

Answer 2
Score 0, 2022-09-23 12:37:11
