
Group sum and count with two unique columns in Python

I have a dataset that I would like to group by two columns, then take the sum and the count of the values in each group.

Data

source  ex  pw  role    date
aa          10  hello   q222
aa          10  hello   q222
        bb  15  ok      q422
        bb  5   no      q422
        bb  1   sure    q422
        bb  4   yes     q422

Desired

source  ex  pw  count   date
aa          20  2       q222
        bb  25  4       q422

Doing

#df.groupby(['source','date'])['pw'].agg(['count','sum'])
df.groupby(['ex','date'])['pw'].agg(['count','sum'])

However, this requires a concatenation afterwards to merge the two outputs. Any suggestions are appreciated.

Use groupby() with dropna=False (available since pandas 1.1), then rename():

out=(df.groupby(['source','ex','date'],dropna=False)['pw'].agg(['count','sum'])
      .reset_index().rename(columns={'sum':'pw'}))

OR

groupby() with dropna=False and named aggregation:

out=(df.groupby(['source','ex'],dropna=False)
       .agg(pw=('pw','sum'),count=('pw','count'),date=('date','first'))
       .reset_index())

Output of out (column order shown is from the first snippet):

    source  ex      date    count   pw
0   aa      NaN     q222    2       20
1   NaN     bb      q422    4       25
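
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the first snippet. It assumes the blank cells in the question's table are missing values (NaN) and that pandas 1.1 or later is installed; the DataFrame construction is rebuilt from the question and the name default_out is only illustrative. It also shows why dropna=False matters: with the default setting, any row whose group key contains NaN is dropped.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "source": ["aa", "aa", np.nan, np.nan, np.nan, np.nan],
    "ex": [np.nan, np.nan, "bb", "bb", "bb", "bb"],
    "pw": [10, 10, 15, 5, 1, 4],
    "role": ["hello", "hello", "ok", "no", "sure", "yes"],
    "date": ["q222", "q222", "q422", "q422", "q422", "q422"],
})

# Default behaviour: rows with NaN in any grouping key are dropped.
# Every row here has NaN in either `source` or `ex`, so nothing is left.
default_out = df.groupby(["source", "ex", "date"])["pw"].agg(["count", "sum"])

# dropna=False keeps NaN as a valid group key.
out = (
    df.groupby(["source", "ex", "date"], dropna=False)["pw"]
      .agg(["count", "sum"])
      .reset_index()
      .rename(columns={"sum": "pw"})
)
print(out)
```

Here default_out comes back empty, while out contains one row per (source, ex, date) combination.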

Try groupby with a new key created with fillna:

out = df.groupby([df.source.fillna(df.ex),df.date]).agg({'source':'first',
                                                   'ex':'first',
                                                   'pw':'sum',
                                                   'role':'count',
                                                   'date':'first'}).reset_index(drop=True)
Out[489]: 
  source    ex  pw  role  date
0     aa  None  20     2  q222
1   None    bb  25     4  q422
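
A variant of the same idea, sketched with named aggregation so the count column gets the name asked for in the question. It assumes the blank cells are NaN; the helper column name key is hypothetical and is dropped at the end.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "source": ["aa", "aa", np.nan, np.nan, np.nan, np.nan],
    "ex": [np.nan, np.nan, "bb", "bb", "bb", "bb"],
    "pw": [10, 10, 15, 5, 1, 4],
    "role": ["hello", "hello", "ok", "no", "sure", "yes"],
    "date": ["q222", "q222", "q422", "q422", "q422", "q422"],
})

# Build a single key: use `source` where present, otherwise fall back to `ex`.
# "key" is an illustrative name, not part of the original data.
key = df["source"].fillna(df["ex"]).rename("key")

out = (
    df.groupby([key, "date"])
      .agg(
          source=("source", "first"),  # missing for the `bb` group
          ex=("ex", "first"),          # missing for the `aa` group
          pw=("pw", "sum"),
          count=("role", "count"),
      )
      .reset_index()
      .drop(columns="key")
)

# Reorder to match the layout in the question's "Desired" table.
out = out[["source", "ex", "pw", "count", "date"]]
print(out)
```

Because the combined key never contains NaN, this does not depend on dropna=False and should also work on older pandas versions that support named aggregation.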

Try:

>>> df.fillna('').groupby(['source','ex','date']).agg({'pw': ['sum', 'count']})
                pw      
               sum count
source ex date          
       bb q422  25     4
aa        q222  20     2
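
The result above has two-level column labels (pw / sum, count). If a flat table like the one in the question is wanted, one possible follow-up is to drop the top column level and reset the index. This is only a sketch; it assumes the blank cells are NaN and uses the string 'sum' for the aggregation.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    "source": ["aa", "aa", np.nan, np.nan, np.nan, np.nan],
    "ex": [np.nan, np.nan, "bb", "bb", "bb", "bb"],
    "pw": [10, 10, 15, 5, 1, 4],
    "role": ["hello", "hello", "ok", "no", "sure", "yes"],
    "date": ["q222", "q222", "q422", "q422", "q422", "q422"],
})

# Replace NaN with empty strings so no group key is dropped.
out = df.fillna("").groupby(["source", "ex", "date"]).agg({"pw": ["sum", "count"]})

# Columns are a MultiIndex: ("pw", "sum") and ("pw", "count").
# Drop the top level, then turn the group keys back into columns.
out.columns = out.columns.droplevel(0)
out = out.reset_index().rename(columns={"sum": "pw"})
print(out)
```

Note that the missing keys come back as empty strings rather than NaN with this approach.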

The technical posts on this site follow the CC BY-SA 4.0 license. If you reprint, please cite the site URL or the original address. For any questions, contact: yoyou2525@163.com.

 