
Drop pandas rows for entire group based on condition

import seaborn
df = seaborn.load_dataset('flights')

I want to drop the years where the average number of passengers per year is less than 200. I tried this:

df[df.groupby(['year'])['passengers'].mean() > 200] 

but get this error:

*** pandas.core.indexing.IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match).

The correct result should drop the rows for these years: 1949, 1950, 1951, and 1952.

I think you need to:

  • group by year,
  • filter the groups, keeping only those where the mean of passengers in the group is > 200 (the threshold from the question).

So the code should be:

df.groupby(['year']).filter(lambda x: x.passengers.mean() > 200)
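
The original attempt fails because the result of groupby(...).mean() is indexed by year, not by the row labels of df, so it cannot be aligned as a boolean row mask. Below is a minimal sketch of two ways around that, using a small made-up DataFrame instead of the flights dataset (the values and the variable names kept_filter, mask, and kept_mask are assumptions for illustration only). It shows filter next to a transform-based mask, which should select the same rows:

```python
import pandas as pd

# Hypothetical toy data standing in for seaborn's flights dataset
df = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950, 1953, 1953],
    "passengers": [112, 118, 115, 126, 196, 236],
})

# Option 1: keep only the groups whose mean passes the threshold
kept_filter = df.groupby("year").filter(lambda g: g["passengers"].mean() > 200)

# Option 2: broadcast each group's mean back onto its rows,
# which yields a boolean mask aligned with df's own index
mask = df.groupby("year")["passengers"].transform("mean") > 200
kept_mask = df[mask]

print(kept_filter)
print(kept_mask)
```

The transform variant avoids calling a Python lambda once per group, which may matter on data with many groups; for a dataset the size of flights either form should be fine.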
