In pandas, find the row with the smallest value greater than a given value in each group

Question

I have a DataFrame that looks like this:

import pandas as pd

df = pd.DataFrame({'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
                   'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0]})
print(df)
     A  B
0   C1  1
1   C1  4
2   C1  8
3   C1  9
4   C2  1
5   C2  3
6   C2  8
7   C2  9
8   C3  1
9   C3  4
10  C3  7
11  C3  0

For each group in A, I want to find the row with the smallest value of B that is greater than 5.

My resulting DataFrame should look like this:

     A  B
2   C1  8
6   C2  8
10  C3  7

I tried this, but it does not give me the whole row:

df[df.B > 5].groupby('A')['B'].min()

What do I need to change?

Answer 1

Use idxmin instead of min to get the index label of each group's minimum, then pass those labels to loc:

df.loc[df[df.B > 5].groupby('A')['B'].idxmin()]

Output:

     A  B
2   C1  8
6   C2  8
10  C3  7
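To see why the idxmin approach returns whole rows, here is a minimal sketch that wraps the steps in a small function. The name smallest_above and its parameters are hypothetical, chosen only for illustration; the pandas calls are the ones used in the answer.

```python
import pandas as pd

df = pd.DataFrame({'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
                   'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0]})


def smallest_above(frame, group_col, value_col, threshold):
    """Hypothetical helper: per group, the row whose value is the smallest one above threshold."""
    # Step 1: keep only the rows strictly above the threshold.
    above = frame[frame[value_col] > threshold]
    # Step 2: idxmin returns, per group, the index label of the minimum,
    # whereas min would return only the minimum value itself.
    labels = above.groupby(group_col)[value_col].idxmin()
    # Step 3: loc looks those labels up in the original frame, so every column survives.
    return frame.loc[labels]


result = smallest_above(df, 'A', 'B', 5)
print(result)
#      A  B
# 2   C1  8
# 6   C2  8
# 10  C3  7
```

A group with no value above the threshold is filtered out in step 1, so it does not appear in the result at all.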

Alternatively, you can use sort_values followed by drop_duplicates:

df[df.B > 5].sort_values('B').drop_duplicates('A')

Output: one row per group, (C1, 8), (C2, 8) and (C3, 7), ordered by B rather than by A.
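The sort_values variant can be sketched as below. Two details are additions of mine rather than part of the answer: kind='mergesort' requests a stable sort so that ties within a group keep their original order, and the trailing sort_index() restores the original row order.

```python
import pandas as pd

df = pd.DataFrame({'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
                   'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0]})

result = (
    df[df.B > 5]                         # keep rows with B above 5
    .sort_values('B', kind='mergesort')  # smallest B first; stable, so ties keep their order
    .drop_duplicates('A')                # keep the first (smallest) row of each group
    .sort_index()                        # optional: back to the original row order
)
print(result)
#      A  B
# 2   C1  8
# 6   C2  8
# 10  C3  7
```

drop_duplicates keeps the first occurrence by default, which is why sorting by B beforehand is what selects the minimum of each group.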

Answer 2

Another way: filter the rows where B is greater than 5, then group by A and take the minimum of B in each group. Note that this returns only the columns A and B, which here happen to be the whole row.

df[df.B.gt(5)].groupby('A')['B'].min().reset_index()

Output:

    A  B
0  C1  8
1  C2  8
2  C3  7
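One caveat may be worth illustrating: min().reset_index() only carries the grouping key and B, so any other column is lost. The sketch below adds a hypothetical column C (not in the original question) to show the difference, and one possible way to recover the full rows by merging the minima back onto the filtered frame.

```python
import pandas as pd

df = pd.DataFrame({'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
                   'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0]})
df['C'] = list('abcdefghijkl')  # hypothetical extra column, for illustration only

filtered = df[df.B.gt(5)]

# Answer 2 as written: only the group key and the minimum of B come back.
mins = filtered.groupby('A')['B'].min().reset_index()
print(mins.columns.tolist())   # ['A', 'B']

# Merging on both A and B brings back the other columns
# (and keeps every row that ties for the minimum within a group).
merged = filtered.merge(mins, on=['A', 'B'])
print(merged.columns.tolist())  # ['A', 'B', 'C']
```

Unlike the idxmin approach, the merge returns a fresh 0-based index and would return several rows for a group whose minimum occurs more than once.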

Answer 1: 3 votes, accepted, 2020-11-03 21:05:15
Answer 2: 0 votes, 2020-11-03 21:09:03