I have a DataFrame with three columns: Date, Advertiser and ID. I grouped the data first to see whether the volumes of some advertisers are too small (for example, when count() is less than 500), and then I want to drop those rows from the grouped table.
df.groupby(['Date','Advertiser']).ID.count()
The result looks like this:
Date Advertiser
2016-01 A 50000
B 50
C 4000
D 24000
2016-02 A 6800
B 7800
C 123
2016-03 B 1111
E 8600
F 500
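For reference, here is a minimal runnable sketch of the same groupby-count step, using hypothetical toy data with much smaller counts than the table above:

```python
import pandas as pd

# Hypothetical toy data standing in for the real (much larger) DataFrame
df = pd.DataFrame({
    'Date': ['2016-01'] * 5 + ['2016-02'] * 3,
    'Advertiser': ['A', 'A', 'A', 'B', 'C', 'A', 'B', 'B'],
    'ID': range(8),
})

# Count IDs per (Date, Advertiser) group
counts = df.groupby(['Date', 'Advertiser']).ID.count()
print(counts)
# Date     Advertiser
# 2016-01  A             3
#          B             1
#          C             1
# 2016-02  A             1
#          B             2
# Name: ID, dtype: int64
```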
I want the result to be this:
Date Advertiser
2016-01 A 50000
C 4000
D 24000
2016-02 A 6800
B 7800
2016-03 B 1111
E 8600
Follow-up question:
What if I want to filter the grouped rows by the total count() within each Date category? For example, I want to keep only dates whose total count() is larger than 15000. The table I want looks like this:
Date Advertiser
2016-01 A 50000
B 50
C 4000
D 24000
2016-02 A 6800
B 7800
C 123
You have a Series object after the groupby, which can be filtered by value with a chained lambda filter:
df.groupby(['Date','Advertiser']).ID.count()[lambda x: x >= 500]
#Date     Advertiser
#2016-01  A             50000
#         C              4000
#         D             24000
#2016-02  A              6800
#         B              7800
#2016-03  B              1111
#         E              8600
#         F               500
Note that x >= 500 keeps the boundary row F (count exactly 500); if rows with a count of exactly 500 should also be dropped, as in the question's expected output, use x > 500 instead.
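For the follow-up question, one way to filter by the per-Date total is to broadcast the total back onto each row of the counted Series with a group-level transform. A sketch, assuming the grouped counts from the question (note: the per-date totals are 2016-01 → 78050, 2016-02 → 14723, 2016-03 → 10211, so the literal threshold "larger than 15000" would also drop 2016-02; a threshold of 14000 is assumed here to reproduce the question's expected table):

```python
import pandas as pd

# Hypothetical Series mirroring the grouped counts shown in the question
counts = pd.Series(
    [50000, 50, 4000, 24000, 6800, 7800, 123, 1111, 8600, 500],
    index=pd.MultiIndex.from_tuples(
        [('2016-01', 'A'), ('2016-01', 'B'), ('2016-01', 'C'), ('2016-01', 'D'),
         ('2016-02', 'A'), ('2016-02', 'B'), ('2016-02', 'C'),
         ('2016-03', 'B'), ('2016-03', 'E'), ('2016-03', 'F')],
        names=['Date', 'Advertiser']),
    name='ID')

# Broadcast each date's total count back onto every row of that date
totals = counts.groupby(level='Date').transform('sum')
# totals: 2016-01 -> 78050, 2016-02 -> 14723, 2016-03 -> 10211

# Keep rows whose date total exceeds the (assumed) threshold;
# this drops every row of 2016-03
result = counts[totals > 14000]
```

The same idea works directly on the raw DataFrame by first counting, then transforming on the Date level, since transform returns a Series aligned with the original index.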