[英]pandas group filter issue
I cannot for the life of me figure out why the filter method refuses to work on my dataframes in pandas. 我不能为我的生活弄清楚为什么过滤器方法拒绝在我的pandas数据帧上工作。
Here is an example showing my issue: 这是一个显示我的问题的示例:
In [99]: dff4
Out[99]: <pandas.core.groupby.DataFrameGroupBy object at 0x1143cbf90>
In [100]: dff3
Out[100]: <pandas.core.groupby.DataFrameGroupBy object at 0x11439a810>
In [101]: dff3.groups
Out[101]:
{'iphone': [85373, 85374],
'remote_api_created': [85363,
85364,
85365,
85412]}
In [102]: dff4.groups
Out[102]: {'bye': [3], 'bye bye': [4], 'hello': [0, 1, 2]}
In [103]: dff4.filter(lambda x: len(x) >2)
Out[103]:
A B
0 0 hello
1 1 hello
2 2 hello
In [104]: dff3.filter(lambda x: len(x) >2)
Out[104]:
Empty DataFrame
Columns: [source]
Index: []
Notice how filter refuses to work on dff3. 请注意过滤器拒绝在dff3上工作。
Any help appreciated. 任何帮助赞赏。
If you group by column name, you move it to index, so your dataframe becomes empty, if no other columns is present, see: 如果按列名分组,则将其移至索引,因此如果不存在其他列,则数据框将变为空,请参阅:
>>> def report(x):
... print x
... return True
>>> df
source
85363 remote_api_created
85364 remote_api_created
85365 remote_api_created
85373 iphone
85374 iphone
85412 remote_api_created
>>> df.groupby('source').filter(report)
Series([], dtype: float64)
Empty DataFrame
Columns: []
Index: [85373, 85374]
Series([], dtype: float64)
Empty DataFrame
Columns: [source]
Index: []
You can group by column values: 您可以按列值进行分组:
>>> df.groupby(df['source']).filter(lambda x: len(x)>2)
source
85363 remote_api_created
85364 remote_api_created
85365 remote_api_created
85412 remote_api_created
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.