简体   繁体   中英

Pandas Compute conditional count for groupby including zero counts

Is there a way in Pandas to count the number of rows containing a specific value based on a group including those groups containing no value?

For instance if I have this dataframe:

dd = pd.DataFrame({'g1':['a','b','a','b','a','b','c','c'],\
'g2':['x','x','z','y','y','z','x','z'],\
'cond':['i','i','i','j','j','j','k','k']})
dd[['g1','g2','cond']]

    g1  g2  cond
0   a   x   i
1   b   x   i
2   a   z   i
3   b   y   j
4   a   y   j
5   b   z   j
6   c   x   k
7   c   z   k

I would like to get all of the counts of j by group, including if the group does not have any rows containing 'j'. The ideal output would look like this:

a   x     0
    y     1
    z     0
b   x     0
    y     1
    z     1
c   x     0
    z     0

If I try and take the conditional slice beforehand, I get a truncated answer which I do not want.

dd.ix[dd['cond']=='j',:].groupby(['g1','g2'])['cond'].count()
g1  g2
a   y     1
b   y     1
    z     1

Additionally, I do not want a pivot table since the data frame does not contain every combination of g1 and g2.

Thanks in advance.

Just apply a function that counts the js.

>>> dd.groupby(['g1', 'g2']).cond.apply(lambda g: (g=='j').sum())
g1  g2
a   x     0
    y     1
    z     0
b   x     0
    y     1
    z     1
c   x     0
    z     0
Name: cond, dtype: int64

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM