
pandas filter group by

I would like to filter the result of this query so that table keeps only the groups whose count is greater than 1, if possible in a single line of code.

import pandas as pd 
import numpy as np 
df= pd.DataFrame({'Product':['A','B', 'C','A','B','D'],
                  'Age':[28,39,21,50,35,43], 
                  'Country':['USA','India','Germany','USA','India','India']
                 })
print(df.head())
table=df.groupby(['Product','Country'])['Age'].count()
table
import pandas as pd 
import numpy as np 
df= pd.DataFrame({'Product':['A','B', 'C','A','B','D'],
                  'Age':[28,39,21,50,35,43], 
                  'Country':['USA','India','Germany','USA','India','India']
                 })

table=df.groupby(['Product','Country'])['Age'].count().reset_index(name='count')
table1 = table[table["count"]>1]
table1

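You can filter on the count column like this, for example. You can also collapse it into a single line. Pass a callable to .loc so the mask is built from the intermediate result, rather than from a table variable that does not exist yet at that point:

table=df.groupby(['Product','Country'])['Age'].count().reset_index(name='count').loc[lambda d: d['count']>1]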

Let's chain the query method:

table = (df.groupby(['Product','Country'])['Age'].count()
           .reset_index(name='Count')
           .query('Count > 1'))
table

Output:

  Product Country  Count
0       A     USA      2
1       B   India      2
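
If a flat table is not needed, the same filter can probably be applied directly to the grouped Series, skipping reset_index. The following is a minimal sketch and not part of the original answers; the names counts and filtered are chosen here for illustration. It relies on .loc accepting a callable, which receives the Series and returns a boolean mask.

```python
import pandas as pd

df = pd.DataFrame({'Product': ['A', 'B', 'C', 'A', 'B', 'D'],
                   'Age': [28, 39, 21, 50, 35, 43],
                   'Country': ['USA', 'India', 'Germany', 'USA', 'India', 'India']})

# Count rows per (Product, Country); the result is a Series with a MultiIndex.
counts = df.groupby(['Product', 'Country'])['Age'].count()

# The callable receives the Series itself, so no intermediate variable
# is needed to build the boolean mask.
filtered = counts.loc[lambda s: s > 1]

print(filtered)
```

The result keeps Product and Country as index levels, so call .reset_index(name='Count') afterwards if you want them back as ordinary columns.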


 