Choosing particular rows from pandas dataframe

Question

I performed a group-by on a pandas dataframe to see how many rows there are for each location and each date.

agg_count = df.groupby(['date', 'location']).count()

Now I want to see the rows of this new dataframe that satisfy a particular condition, say, a count greater than 50. How do I get those rows efficiently from this huge dataframe?

Answer 1

Starting with this data

In [275]: df = pd.DataFrame({'date': [20130101, 20130101, 20130102], 'location': ['a', 'a', 'c']})

In [276]: df
Out[276]:
       date location
0  20130101        a
1  20130101        a
2  20130102        c

This selects the rows belonging to groups that have a count > 1

In [277]: df.groupby(['date', 'location']).apply(lambda sdf: sdf if len(sdf) > 1 else None)
Out[277]:
                         date location
date     location
20130101 a        0  20130101        a
                  1  20130101        a
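
As an alternative to returning None from apply, newer pandas versions provide a groupby filter method that does the same selection. The following is a minimal sketch, assuming a pandas version that has GroupBy.filter; the variable name big_groups is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'date': [20130101, 20130101, 20130102],
                   'location': ['a', 'a', 'c']})

# Keep only the rows whose (date, location) group has more than one row.
# filter() calls the function once per group and keeps the groups
# for which it returns True.
big_groups = df.groupby(['date', 'location']).filter(lambda g: len(g) > 1)

print(big_groups)
```

The result keeps the original row index rather than adding the group keys as index levels.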

Dropping the multi-index:

In [278]: df.groupby(['date', 'location']).apply(lambda sdf: sdf if len(sdf) > 1 else None).reset_index(drop=True)
Out[278]:
       date location
0  20130101        a
1  20130101        a
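
If only the per-group counts are needed, as in the question, a boolean mask on the counts avoids calling a Python function per group. This is a sketch under assumptions: threshold is a hypothetical name standing in for the 50 from the question (set to 1 here to match the sample data), and transform('size') requires a reasonably recent pandas.

```python
import pandas as pd

df = pd.DataFrame({'date': [20130101, 20130101, 20130102],
                   'location': ['a', 'a', 'c']})

threshold = 1  # hypothetical; the question uses 50

# One count per (date, location) group, then keep the large groups.
counts = df.groupby(['date', 'location']).size()
large = counts[counts > threshold]
print(large)

# To get the original rows instead of the counts, broadcast each
# group's size back onto its rows and use it as a boolean mask.
group_size = df.groupby(['date', 'location'])['date'].transform('size')
rows = df[group_size > threshold]
print(rows)
```

Here large is a Series indexed by (date, location), while rows is a slice of the original dataframe with its original index.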

solution1
0 ACCEPTED 2013-03-26 13:52:35