Python pandas dataframe group by based on a condition
My question is simple: I have a dataframe, and I group it by a column and get the size of each group like this:
df.groupby('column').size()
Now the problem is that I only want the groups whose size is greater than X. I am wondering if I can do it using a lambda function or anything similar? I have already tried this:
df.groupby('column').size() > X
but it only prints out some True and False values.
The result of groupby(...).size() is a regular Series indexed by group, so just filter it with a boolean mask as usual:
>>> import pandas as pd
>>> df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
>>> after = df.groupby('a').size()
>>> after
a
a    3
b    2
c    1
d    1
dtype: int64
>>> after[after > 2]
a
a    3
dtype: int64
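Since the question asks about a lambda, the same mask can also be applied in one expression by passing a callable to `.loc`. This is only a sketch: the threshold `X = 2` is an illustrative value, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd']})
X = 2  # illustrative threshold (assumption, not from the question)

# .loc accepts a callable; it receives the size Series and
# must return a boolean mask with the same index.
big = df.groupby('a').size().loc[lambda s: s > X]
print(big)
```

This avoids the intermediate variable while still returning group sizes rather than True/False values.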
Alternatively, to keep the rows of df that belong to groups with more than X rows, try filter:
df.groupby('column').filter(lambda group: len(group) > X)
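Two details about `filter` are worth spelling out. It returns the original rows of the qualifying groups, not the group sizes, and the row count of a group is `len(group)`, whereas a DataFrame's `.size` attribute counts cells (rows times columns). The sketch below uses a hypothetical extra column `val` and an illustrative `X = 2`, and also shows a vectorized alternative based on `transform('size')` that may be faster on frames with many groups.

```python
import pandas as pd

# 'val' is a made-up column so the frame has more than one column
df = pd.DataFrame({'a': ['a', 'b', 'a', 'a', 'b', 'c', 'd'],
                   'val': list(range(7))})
X = 2  # illustrative threshold (assumption)

# Keep every row whose group has more than X rows
kept = df.groupby('a').filter(lambda group: len(group) > X)

# Vectorized alternative: broadcast each group's size back to its rows,
# then use the result as an ordinary boolean mask
mask = df.groupby('a')['a'].transform('size') > X
kept_fast = df[mask]

print(kept)
```

Both results keep the original index, so they can be compared or joined back to `df` directly.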