
Choosing particular rows from pandas dataframe

I have performed a groupby on a pandas dataframe to see how many rows there are for each location and each date.

agg_count = df.groupby(['date', 'location']).count()

Now I want to see the rows of this new dataframe that satisfy a particular condition, say, a count greater than 50. How do I iterate over this huge dataframe efficiently to get those rows?

Starting with this data:

In [275]: df = pd.DataFrame({'date': [20130101, 20130101, 20130102], 'location': ['a', 'a', 'c']})

In [276]: df
Out[276]:
       date location
0  20130101        a
1  20130101        a
2  20130102        c

This selects the rows that belong to groups with a count > 1:

In [277]: df.groupby(['date', 'location']).apply(lambda sdf: sdf if len(sdf) > 1 else None)
Out[277]:
                         date location
date     location
20130101 a        0  20130101        a
                  1  20130101        a
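The same selection can probably be written more directly with `GroupBy.filter`, which keeps every row of a group when the predicate is true and drops the whole group otherwise. This is a minimal sketch on the sample data above, not a benchmarked solution; the name `result` is only illustrative.

```python
import pandas as pd

# Same sample data as in the session above.
df = pd.DataFrame({'date': [20130101, 20130101, 20130102],
                   'location': ['a', 'a', 'c']})

# Keep only the rows whose (date, location) group has more than one member.
# filter() returns rows of the original frame and preserves its index,
# so no MultiIndex is created.
result = df.groupby(['date', 'location']).filter(lambda sdf: len(sdf) > 1)
print(result)
```

Because the original index is kept, there is nothing to reset afterwards.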

The same selection with the MultiIndex dropped:

In [278]: df.groupby(['date', 'location']).apply(lambda sdf: sdf if len(sdf) > 1 else None).reset_index(drop=True)
Out[278]:
       date location
0  20130101        a
1  20130101        a
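If only the group counts are needed, as in the question, a boolean mask on the counts avoids both iteration and a Python-level lambda. The sketch below assumes the sample data above and uses a hypothetical `threshold` of 1 in place of the 50 from the question; `counts`, `large`, and `rows` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'date': [20130101, 20130101, 20130102],
                   'location': ['a', 'a', 'c']})

threshold = 1  # hypothetical value; the question uses 50

# size() gives one row count per (date, location) group.
counts = df.groupby(['date', 'location']).size()

# Boolean indexing keeps only the groups above the threshold.
large = counts[counts > threshold]
print(large)

# To get the original rows instead of the counts, broadcast each
# group's size back onto its rows and mask the original frame.
group_size = df.groupby(['date', 'location'])['date'].transform('size')
rows = df[group_size > threshold]
print(rows)
```

Both steps are vectorized, so they should scale better on a large dataframe than applying a lambda to every group.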

