
R: sum unique values in aggregate function and treat NA as 0

I have a table like:

ppp<-data.frame(client=c(1,1,1,3,3,4), 
                calldate=c('2014-08-07', NA,'2014-08-06',NA, '2014-08-08',NA),
                paydate=c('2014-08-07', '2014-08-09', NA, '2014-08-06',NA,'2014-08-06' ))

I need to count the distinct call dates for each client. I tried:

my.fun<-function (x) {sum(!is.na(unique(x)))}
ppp2<-aggregate(calldate~(client+calldate) , ppp, my.fun)

I got:

> ppp2
  client calldate
      1        2
      3        1

As you can see, I lost client number 4. I need to keep all of the clients, with a zero for any client that didn't receive a call:

  client calldate
      1        2
      3        1
      4        0

How can I count the dates and put a 0 if a client doesn't have a date? I also tried:

my.fun<-function (x) {length(unique(x))}

and got the same result.

I tried the following too:

my.fun<-function (x) {if (is.na(x)) {0} else {length(unique(x))}}

and I get a warning:

Warning message: In if (is.na(x)) { : the condition has length > 1 and only the first element will be used

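It works with your first definition of my.fun if you pass the argument na.action = na.pass. By default, the formula method of aggregate uses na.action = na.omit, which drops every row containing an NA before grouping, so client 4 disappears before my.fun is ever called.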

aggregate(calldate ~ client, ppp, my.fun, na.action = na.pass)
#   client calldate
# 1      1        2
# 2      3        1
# 3      4        0
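
For a self-contained check, here is a minimal sketch that rebuilds the question's data and compares the default behaviour with na.action = na.pass. It assumes the first definition of my.fun from the question; the names count_dates, dropped and kept are introduced here for illustration only.

```r
# Rebuild the example data from the question.
ppp <- data.frame(
  client   = c(1, 1, 1, 3, 3, 4),
  calldate = c("2014-08-07", NA, "2014-08-06", NA, "2014-08-08", NA),
  paydate  = c("2014-08-07", "2014-08-09", NA, "2014-08-06", NA, "2014-08-06")
)

# Count distinct non-missing values (same logic as the question's first my.fun).
count_dates <- function(x) sum(!is.na(unique(x)))

# Default na.action (na.omit): rows with NA in calldate are dropped before
# grouping, so a client whose call dates are all NA never reaches the function.
dropped <- aggregate(calldate ~ client, ppp, count_dates)

# na.pass keeps those rows, so every client forms a group and a group
# containing only NA is counted as 0.
kept <- aggregate(calldate ~ client, ppp, count_dates, na.action = na.pass)

print(dropped)
print(kept)
```

The third attempt in the question fails for a separate reason: is.na(x) returns one value per element, while if needs a single TRUE or FALSE (from R 4.2.0 onward this is an error rather than a warning). Wrapping the test as all(is.na(x)) would satisfy if, although the sum(!is.na(unique(x))) form above avoids the branch entirely.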

