
R dplyr - rollmean using group by columns

I have a data frame like the one below:

(see the image if the data is not formatted properly)

Sample data:
date       id    name  loc     mean  count  mean
9/6/2016   kar1  AAA   100004  0     1
9/8/2016   kar1  AAA   100004  0     3
9/9/2016   kar1  AAA   100004  0     4
9/10/2016  kar1  AAA   100004  0     5
9/11/2016  kar1  AAA   100004  0     6
9/12/2016  kar1  AAA   100004  0     7
9/13/2016  kar1  AAA   100004  0     8
9/14/2016  kar1  AAA   100004  0     9
9/7/2016   blr1  BBB   100004  0     2

I am trying to calculate a 7-day centered rolling mean (the previous 3 days, the current day, and the following 3 days) of the count field, grouped by id, name, and loc, but the results are not as expected.

Here is my code:

fnrollmean <- function(x) rollmean(df$count,7,na.pad=TRUE,align="center")

rollmeandf <- df %>% group_by(id,name,loc) %>% arrange(id,name,loc) %>% mutate(funs=fnrollmean(df$count))

I get this error:

Error in eval(substitute(expr), envir, enclos) : incompatible size (9), expecting 8 (the group size) or 1

If I just do:

test2 <- df %>% mutate(funs=fnrollmean(df$count))

it runs, but it computes the rolling mean over all rows together and ignores the grouping, which is wrong.

Please let me know if I am missing something, or if there is a workaround.

Expected results:

date       id    name  loc     mean  count  mean
9/6/2016   kar1  AAA   100004  0     1      NA
9/8/2016   kar1  AAA   100004  0     3      NA
9/9/2016   kar1  AAA   100004  0     4      NA
9/10/2016  kar1  AAA   100004  0     5      4.8
9/11/2016  kar1  AAA   100004  0     6      6
9/12/2016  kar1  AAA   100004  0     7      NA
9/13/2016  kar1  AAA   100004  0     8      NA
9/14/2016  kar1  AAA   100004  0     9      NA
9/7/2016   blr1  BBB   100004  0     2      NA

(Image: sample data)

Thanks

To use mutate, you need a window function that returns a vector of the same length as its input (or a single value, which is recycled to that length). Your fnrollmean does not: it ignores its argument x and always works on the whole df$count column, so it returns 9 values even for the 8-row group, hence the error. Note that the same kind of error remains with your posted data even after following jdobre's comments, because your second group (blr1, BBB, 100004) has only 1 row, fewer than the window width of 7. Therefore, modify fnrollmean as:

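library(zoo)
fnrollmean <- function (x) {
  # A group shorter than the 7-row window has no complete window: return all NA
  if (length(x) < 7) {
    rep(NA_real_, length(x))
  } else {
    # fill = NA pads both ends so the result is as long as x
    # (it replaces the deprecated na.pad = TRUE)
    rollmean(x, 7, align = "center", fill = NA)
  }
}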
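As a quick sanity check of the length requirement, the sketch below calls the guarded function directly on the two groups' counts. It restates fnrollmean so that it runs on its own; the variable names long_group and short_group are made up for this illustration and are not part of the original post.

```r
library(zoo)

# Same guard as above, restated so this snippet is self-contained.
fnrollmean <- function(x) {
  if (length(x) < 7) {
    rep(NA_real_, length(x))
  } else {
    rollmean(x, 7, align = "center", fill = NA)
  }
}

# Hypothetical stand-ins for the count values of each group in the question.
long_group  <- c(1, 3, 4, 5, 6, 7, 8, 9)  # the eight kar1 rows
short_group <- 2                           # the single blr1 row

# In both cases the output has exactly as many elements as the input,
# which is what mutate needs inside each group.
length(fnrollmean(long_group)) == length(long_group)
length(fnrollmean(short_group)) == length(short_group)

# Only rows 4 and 5 of the long group have a complete 7-row window.
fnrollmean(long_group)
```

Returning a full-length vector padded with NA, rather than a shorter vector, is what lets the function be used unchanged on every group.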

Note that, following jdobre's comment, the function uses its argument x instead of df$count. Then, again following jdobre's comment, pass count rather than df$count when calling fnrollmean within mutate:

library(dplyr)
result <- df %>% group_by(id,name,loc) %>% 
                 mutate(rollavg=fnrollmean(count))

gives:

print(result)
##Source: local data frame [9 x 7]
##Groups: id, name, loc [2]
##
##       date     id   name    loc  mean count  rollavg
##     <fctr> <fctr> <fctr>  <int> <int> <int>    <dbl>
##1  9/6/2016   kar1    AAA 100004     0     1       NA
##2  9/8/2016   kar1    AAA 100004     0     3       NA
##3  9/9/2016   kar1    AAA 100004     0     4       NA
##4 9/10/2016   kar1    AAA 100004     0     5 4.857143
##5 9/11/2016   kar1    AAA 100004     0     6 6.000000
##6 9/12/2016   kar1    AAA 100004     0     7       NA
##7 9/13/2016   kar1    AAA 100004     0     8       NA
##8 9/14/2016   kar1    AAA 100004     0     9       NA
##9  9/7/2016   blr1    BBB 100004     0     2       NA
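For anyone who wants to reproduce this, here is a minimal end-to-end sketch. The data.frame construction is an assumption reconstructed from the sample table in the question (the original post does not show how df was created, and the printed output suggests the text columns were factors there, whereas plain character columns are used here). The trailing ungroup() is also an addition, so that later operations on result are not silently grouped.

```r
library(zoo)
library(dplyr)

# Assumed reconstruction of the question's sample data.
df <- data.frame(
  date  = c("9/6/2016", "9/8/2016", "9/9/2016", "9/10/2016", "9/11/2016",
            "9/12/2016", "9/13/2016", "9/14/2016", "9/7/2016"),
  id    = c(rep("kar1", 8), "blr1"),
  name  = c(rep("AAA", 8), "BBB"),
  loc   = 100004L,
  mean  = 0L,
  count = c(1L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 2L),
  stringsAsFactors = FALSE
)

# Guarded rolling mean: always returns a vector as long as its input.
fnrollmean <- function(x) {
  if (length(x) < 7) {
    rep(NA_real_, length(x))
  } else {
    rollmean(x, 7, align = "center", fill = NA)
  }
}

result <- df %>%
  group_by(id, name, loc) %>%
  mutate(rollavg = fnrollmean(count)) %>%
  ungroup()

print(result)
```

One caveat: rollmean slides over rows, not calendar days. The kar1 group has no row for 9/7/2016, so the window centered on 9/10/2016 spans the rows from 9/6/2016 to 9/13/2016, which is eight calendar days. If a strict calendar window is required, the missing dates would need to be filled in first.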
