Pandas DataFrame to drop rows in the groupby
I have a DataFrame with three columns: Date, Advertiser and ID. I first grouped the data to see whether the volumes of some advertisers are too small (for example, when count() is less than 500). I then want to drop those rows from the grouped table.
df.groupby(['Date','Advertiser']).ID.count()
The result looks like this:
Date     Advertiser
2016-01  A             50000
         B                50
         C              4000
         D             24000
2016-02  A              6800
         B              7800
         C               123
2016-03  B              1111
         E              8600
         F               500
I want the result to be this:
Date     Advertiser
2016-01  A             50000
         C              4000
         D             24000
2016-02  A              6800
         B              7800
2016-03  B              1111
         E              8600
Follow-up question:
What if I want to filter the rows in the groupby based on the total count() per date instead? For example, I want to keep only the dates whose total count() is larger than 15000. The table I want looks like this:
Date     Advertiser
2016-01  A             50000
         B                50
         C              4000
         D             24000
2016-02  A              6800
         B              7800
         C               123
You have a Series object after the groupby, which can be filtered by value with a chained lambda filter:
df.groupby(['Date','Advertiser']).ID.count()[lambda x: x >= 500]
# Date     Advertiser
# 2016-01  A             50000
#          C              4000
#          D             24000
# 2016-02  A              6800
#          B              7800
# 2016-03  B              1111
#          E              8600
#          F               500
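
For the follow-up question, one possible approach (a sketch, not part of the original answer) is to compute each date's total and broadcast it back onto every row with a level-based groupby and transform, then filter with a boolean mask. The Series below is a hypothetical stand-in for the grouped counts from the question, built by hand so the example is self-contained; the names counts, date_totals, threshold and kept are illustrative only.

```python
import pandas as pd

# Hypothetical stand-in for df.groupby(['Date','Advertiser']).ID.count(),
# using the counts listed in the question.
index = pd.MultiIndex.from_tuples(
    [
        ("2016-01", "A"), ("2016-01", "B"), ("2016-01", "C"), ("2016-01", "D"),
        ("2016-02", "A"), ("2016-02", "B"), ("2016-02", "C"),
        ("2016-03", "B"), ("2016-03", "E"), ("2016-03", "F"),
    ],
    names=["Date", "Advertiser"],
)
counts = pd.Series(
    [50000, 50, 4000, 24000, 6800, 7800, 123, 1111, 8600, 500],
    index=index,
    name="ID",
)

# Sum the counts within each Date and repeat that total on every
# (Date, Advertiser) row, so it lines up with the original index.
date_totals = counts.groupby(level="Date").transform("sum")

threshold = 15000  # assumed cutoff, taken from the follow-up question
kept = counts[date_totals > threshold]
print(kept)
```

Note that with these sample counts the total for 2016-02 is 14723, so a strict cutoff of 15000 keeps only 2016-01. Reproducing the table in the follow-up, which also lists 2016-02, would need a lower threshold.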