select groups having more than x members

Question

Is there a way in pandas to select, from a grouped DataFrame, only the groups with more than x members?

Something like:

grouped = df.groupby(['a', 'b'])
dupes = [g[['a', 'b', 'c', 'd']] for _, g in grouped if len(g) > 1]

I can't find a solution in the docs or on SO.

Answer 1

Use filter:

grouped.filter(lambda x: len(x) > 1)

Example:

In [64]:
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': [0, 0, 1, 2], 'b': np.arange(4)})
df

Out[64]:
   a  b
0  0  0
1  0  1
2  1  2
3  2  3

In [65]:
df.groupby('a').filter(lambda x: len(x)>1)

Out[65]:
   a  b
0  0  0
1  0  1
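
To tie this back to the question's two-key grouping, here is a minimal sketch. The data and the threshold x = 2 are made up for illustration; only the column names a, b, c, d come from the question. It also shows a possible alternative that avoids calling a Python lambda per group by broadcasting each group's size back onto its rows with transform('size'). That variant may be faster on frames with many groups, though this has not been measured here.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({
    "a": [0, 0, 1, 1, 1, 2],
    "b": [0, 0, 1, 1, 1, 2],
    "c": [10, 11, 12, 13, 14, 15],
    "d": list("uvwxyz"),
})
x = 2  # assumed threshold: keep groups with more than x rows

# Option 1: filter, as in the answer, grouped on both keys.
by_filter = df.groupby(["a", "b"]).filter(lambda g: len(g) > x)

# Option 2: compute each row's group size, then use a boolean mask.
sizes = df.groupby(["a", "b"])["c"].transform("size")
by_mask = df[sizes > x]

print(by_mask)
```

Both options return the rows of the original frame with their original index, so the result can be regrouped if the individual groups are needed afterwards.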

1 answer

solution1
1 ACCEPTED 2016-06-22 13:31:39
