
Filter a pandas dataframe based on two columns

I am trying to filter a pandas dataframe based on two columns, so that for each value in column 1 only the rows where column 2 equals that value's minimum are kept. I know that sounds confusing, so here is an example:

> import pandas as pd
> df = pd.DataFrame([{'a':'anno1', 'ppm':1},{'a':'anno1', 'ppm':2},{'a':'anno2', 'ppm':2},{'a':'anno2', 'ppm':2}])

> df
       a  ppm
0  anno1    1
1  anno1    2
2  anno2    2
3  anno2    2

I want rows 0, 2 and 3, because for anno1 the minimum ppm is 1, and for anno2 the minimum ppm is 2 (keep both rows). So I started with a groupby:

> grouped_series = df.groupby(['a']).ppm.min()
> grouped_series
a
anno1    1
anno2    2

Now I have the minimum ppm for each value in a. But how do I use this series to filter the original dataframe? Or is there an easier way to do this? I tried several variations of:

new_df = df.loc[ df.loc[:,'ppm']==grouped_series.loc[df.loc[:,'a']] , :]

but this gives me a ValueError: Can only compare identically-labeled Series objects
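The error can be reproduced in isolation. The sketch below assumes the same df as in the question; the names looked_up and per_row_min are introduced only for illustration. The lookup through grouped_series.loc returns a Series labelled by the values of a, while df['ppm'] is labelled 0 to 3, so the two sides of == do not share an index. Series.map is one way to look up the minimum while keeping the row labels of df:

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 'anno1', 'ppm': 1},
    {'a': 'anno1', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
])
grouped_series = df.groupby(['a']).ppm.min()

# Looking up by label gives a Series indexed by the 'a' values,
# not by the original row labels 0..3.
looked_up = grouped_series.loc[df['a']]

# Comparing two Series with different indexes raises the ValueError.
raised = False
try:
    df['ppm'] == looked_up
except ValueError:
    raised = True

# map() translates each value of 'a' into its group minimum
# and keeps the index of df, so the comparison is well defined.
per_row_min = df['a'].map(grouped_series)
new_df = df.loc[df['ppm'] == per_row_min, :]
```

This keeps the original approach (a separate grouped series) and only changes how the lookup is aligned with df.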

Use GroupBy.transform to get the per-group minimum as a Series with the same size as df, so the comparison works directly. For filtering with boolean indexing, loc is not necessary:

new_df = df[df['ppm'] == df.groupby('a').ppm.transform('min')]
print (new_df)
       a  ppm
0  anno1    1
2  anno2    2
3  anno2    2
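To see why this works, it may help to look at the intermediate result. The following is a minimal sketch using the same data; the names group_min and mask are introduced here only for illustration:

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 'anno1', 'ppm': 1},
    {'a': 'anno1', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
])

# transform('min') broadcasts each group's minimum back onto every row,
# so the result has one value per row of df and shares df's index.
group_min = df.groupby('a')['ppm'].transform('min')

# Row-wise comparison: True where a row's ppm equals its group's minimum.
mask = df['ppm'] == group_min

# Boolean indexing keeps the matching rows and their original labels.
new_df = df[mask]
```

Because group_min carries the same index as df, ties within a group (rows 2 and 3 here) are all kept.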

Here is an alternative approach if you don't mind the original index being reset:

df.merge(df.groupby(['a'])['ppm'].min().reset_index(), how='inner')

Output:

       a  ppm
0  anno1    1
1  anno2    2
2  anno2    2
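If the original row labels matter, one possible variation (an assumption on my part, not part of the answer above) is to carry the index through the merge as an ordinary column and restore it afterwards. The names mins, merged and kept are hypothetical:

```python
import pandas as pd

df = pd.DataFrame([
    {'a': 'anno1', 'ppm': 1},
    {'a': 'anno1', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
    {'a': 'anno2', 'ppm': 2},
])

# One row per group holding that group's minimum ppm.
mins = df.groupby('a', as_index=False)['ppm'].min()

# Plain inner merge on both columns: the result gets a fresh 0..n-1 index.
merged = df.merge(mins, on=['a', 'ppm'], how='inner')

# Variation: move the index into a column before merging, then restore it.
kept = (
    df.reset_index()                      # original labels become column 'index'
      .merge(mins, on=['a', 'ppm'], how='inner')
      .set_index('index')
      .rename_axis(None)                  # drop the leftover index name
)
```

Here merged matches the output shown above, while kept holds the same rows under their original labels 0, 2 and 3.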
