How to filter out an entire group based on condition?

I want to remove groups that have no values within the past year in the created_date column. Here is the data:

+--------+----------------+-----------------------+---------------------+
| class  |     title      |      description      |    created_date     |
+--------+----------------+-----------------------+---------------------+
| ClassA | ClassA Title 1 | Class A Description 1 | 2017-06-20 21:59:07 |
| ClassA | ClassA Title 2 | Class A Description 2 | 2015-06-20 21:59:07 |
| ClassA | ClassA Title 3 | Class A Description 3 | 2014-06-20 21:59:07 |
| ClassB | ClassB Title 1 | Class A Description 1 | 2016-06-20 21:59:07 |
| ClassB | ClassB Title 2 | Class A Description 2 | 2015-06-20 21:59:07 |
| ClassB | ClassB Title 3 | Class A Description 3 | 2014-06-20 21:59:07 |
| ClassC | ClassC Title 1 | Class C Description 1 | 2017-06-20 21:59:07 |
| ClassC | ClassC Title 2 | Class C Description 2 | 2016-06-20 21:59:07 |
| ClassC | ClassC Title 3 | Class C Description 3 | 2015-06-20 21:59:07 |
+--------+----------------+-----------------------+---------------------+

In the data above, only group ClassB has no created_date within the past year. I want to filter out the entire ClassB group so that I end up with only 6 records.

I tried using filter, but I am not sure what to do with the group inside the lambda:

df.groupby(["class"]).filter(lambda group: ...)

Assume your cutoff date is stored in a variable named date:

# Keep a group only if at least one of its rows is on or after the cutoff date
f = lambda g: not g[g.created_date >= date].empty
df.groupby('class').filter(f)
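
For anyone who wants to try this end to end, here is a minimal sketch built on the sample data from the question. It makes a few assumptions that are not in the original posts: created_date is parsed to a datetime dtype, the title and description columns are left out for brevity, and "today" is a hypothetical fixed date (2017-12-31) so that the one-year cutoff is reproducible.

```python
import pandas as pd

# Sample data modelled on the table in the question (title/description omitted).
df = pd.DataFrame({
    "class": ["ClassA"] * 3 + ["ClassB"] * 3 + ["ClassC"] * 3,
    "created_date": pd.to_datetime([
        "2017-06-20 21:59:07", "2015-06-20 21:59:07", "2014-06-20 21:59:07",
        "2016-06-20 21:59:07", "2015-06-20 21:59:07", "2014-06-20 21:59:07",
        "2017-06-20 21:59:07", "2016-06-20 21:59:07", "2015-06-20 21:59:07",
    ]),
})

# Assumption: a fixed, hypothetical "today" so the example is reproducible.
today = pd.Timestamp("2017-12-31")
date = today - pd.DateOffset(years=1)  # cutoff: one year before "today"

# Keep a group only if at least one of its rows is on or after the cutoff.
result = df.groupby("class").filter(
    lambda g: not g[g.created_date >= date].empty
)

print(result)
```

With this cutoff, ClassB is the only group whose newest row is older than one year, so result should hold the six ClassA and ClassC rows. In real use, today would more likely be pd.Timestamp.now().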
