简体   繁体   中英

How to group by and count number of none missing values for each column in group in pandas

I have the following datadrame

user_id  var  qualified_date    loyal_date
  1       1      2017-01-17     2017-02-03
  2       1      2017-01-03     2017-01-13
  3       1      2017-01-11      NaT
  4       1       NaT            NaT
  5       1       NaT            NaT
  6       2      2017-01-15      2017-02-14
  7       2      2017-01-07      NaT
  8       2      2017-01-23      2017-02-18
  9       2      2017-01-25      NaT
  10      2      2017-01-11      2017-03-01

I need to groupby this dataframe by values in 'Var' and count the number of non missing values then for each of 'qualified_date' and 'engaged_date' columns. I can do it for each column seperately and put them manually in a dataframe, but i am looking for a groupby approach or soething similar where i can automatically come to a new DF than has values in 'var' as index and for for two columns that show count of non missing values for every group.

Like this

var  qualified_count loyal_count
 1       xx            xx
 2       xx            xx

You can use DF.GroupBy.count which includes only Non- NaN entries while counting. So, you can let var be the grouped key and then aggregate the counts respectively for the two selected columns of the DF as shown:

cols = ['qualified_date', 'loyal_date']
df.groupby('var')[cols].agg('count').add_suffix("_count").reset_index()

在此输入图像描述

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM