In pandas, find the row per group with the smallest value greater than a given value
I have a dataframe which looks like this:
import pandas as pd
df = pd.DataFrame({'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
                   'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0]})
A B
0 C1 1
1 C1 4
2 C1 8
3 C1 9
4 C2 1
5 C2 3
6 C2 8
7 C2 9
8 C3 1
9 C3 4
10 C3 7
11 C3 0
For each group in A, I want to find the row with the smallest value of B that is greater than 5.
My resulting dataframe should look like this:
A B
2 C1 8
6 C2 8
10 C3 7
I have tried this, but it does not give me the whole row:
df[df.B >= 4].groupby('A')['B'].min()
What do I need to change?
Use idxmin instead of min to extract the index label of each group's minimum, then use loc to select those rows:
df.loc[df[df.B > 5].groupby('A')['B'].idxmin()]
Output:
A B
2 C1 8
6 C2 8
10 C3 7
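The idxmin approach can be sketched as a self-contained script. The helper name smallest_above and its parameters are hypothetical, introduced only for illustration; a group with no value above the threshold simply does not appear in the result.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
    'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0],
})


def smallest_above(frame, threshold, group_col='A', value_col='B'):
    """Hypothetical helper: per group, the row with the smallest value above threshold."""
    # Keep only the rows strictly above the threshold.
    above = frame[frame[value_col] > threshold]
    # idxmin returns the original index label of each group's minimum.
    idx = above.groupby(group_col)[value_col].idxmin()
    # Select the full rows by those labels.
    return frame.loc[idx]


result = smallest_above(df, 5)
print(result)
```

Because loc selects by the original index labels, any other columns in the dataframe are carried along with the selected rows.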
Alternatively, you can use sort_values followed by drop_duplicates, which keeps the first (smallest) row for each value of A:
df[df.B > 5].sort_values('B').drop_duplicates('A')
Output:
A B
10 C3 7
2 C1 8
6 C2 8
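The rows above come back ordered by B rather than by their original position. If the original order matters, a possible variation is to append sort_index. The kind='mergesort' argument is an addition not in the original answer; it requests a stable sort so that ties in B are resolved by original row order.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
    'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0],
})

result = (
    df[df.B > 5]
    .sort_values('B', kind='mergesort')  # stable sort: smallest B first
    .drop_duplicates('A')                # keep the first row seen per group
    .sort_index()                        # restore the original row order
)
print(result)
```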
Another way: filter the rows where B is greater than 5, then group by A and take the min of B in each group.
df[df.B.gt(5)].groupby('A')['B'].min().reset_index()
A B
0 C1 8
1 C2 8
2 C3 7
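Note that the index in this output is a fresh 0..2 range rather than the original row labels, because min returns values, not rows. A small sketch of the difference, assuming a hypothetical extra column C that is not part of the original question:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['C1', 'C1', 'C1', 'C1', 'C2', 'C2', 'C2', 'C2', 'C3', 'C3', 'C3', 'C3'],
    'B': [1, 4, 8, 9, 1, 3, 8, 9, 1, 4, 7, 0],
})
# Hypothetical extra column, added only to make the difference visible.
df['C'] = list('abcdefghijkl')

# min + reset_index: only the group key and the minimum value survive.
mins_only = df[df.B.gt(5)].groupby('A')['B'].min().reset_index()

# idxmin + loc: the complete original rows, including column C.
full_rows = df.loc[df[df.B.gt(5)].groupby('A')['B'].idxmin()]

print(mins_only)
print(full_rows)
```

If only the minimum value per group is needed, the min version is enough; if the rest of the row matters, the idxmin version keeps it.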