I am trying to use rollmean from the package zoo in a data.table while grouping data.
It works fine when all groups have enough data:
library(data.table)
library(zoo)  # provides rollmean
dt = data.table(x=rep(c("a","b"),10),y=rnorm(20))
dt[,.(ma=rollmean(y, k = 7, fill=NA,align="right")), by = .(x)]
But when one of the groups has too few observations, it returns an error:
dt2 = data.table(x=rep(c("c"),1),y=rnorm(1))
dt3=rbind(dt,dt2)
dt3[,.(ma=rollmean(y, k = 7, fill=NA,align="right")), by = .(x)]
Here's the error message:
Column 1 of result for group 3 is type 'logical' but expecting type 'double'. Column types must be consistent for each group.
It seems to happen because rollmean returns a logical vector (NA is logical by default) when a group doesn't have enough data. Given that my data is always positive, I use the following trick to make my code run anyway:
dt4=dt3[,.(ma=rollmean(y, k = 7, fill=-1,align="right")), by = .(x)]
dt4[ma==-1,ma:=NA]
dt4
Is there a proper/better way to do it?
We can use NA_real_ instead of NA, because a plain NA is of type logical by default:
dt3[x == 'c', class(rollmean(y, k = 7, fill = NA, align = 'right'))]
#[1] "logical"
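The type difference can be checked in base R without any packages. This is only a small sketch; the variable names are made up for illustration:

```r
# Plain NA is a logical constant; NA_real_ is the double-typed missing value.
plain_na  <- NA
double_na <- NA_real_

typeof(plain_na)   # "logical"
typeof(double_na)  # "double"

# Both still count as missing.
is.na(plain_na)    # TRUE
is.na(double_na)   # TRUE
```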
With NA_real_ as the fill value, it works fine:
dt3[,.(ma=rollmean(y, k = 7, fill=NA_real_,align="right")), by = .(x)]
# x ma
# 1: a NA
# 2: a NA
# 3: a NA
# 4: a NA
# 5: a NA
# 6: a NA
# 7: a 0.19653855
# 8: a -0.05506344
# 9: a -0.17022022
#10: a -0.28731762
#11: b NA
#12: b NA
#13: b NA
#14: b NA
#15: b NA
#16: b NA
#17: b 0.02117906
#18: b -0.07079598
#19: b -0.05393943
#20: b 0.04511924
#21: c NA
In the other groups, fill also produces NA, but there the NA gets coerced to a numeric NA because the result also contains non-NA (double) elements.
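The coercion can be illustrated with plain vectors in base R. This is a rough sketch with made-up values standing in for the per-group results, not the actual output of rollmean:

```r
# A group with too few rows yields only fill values, so the vector stays logical.
short_group <- c(NA)

# A group with enough rows mixes fill values with real means,
# so R coerces the whole vector (including the NAs) to double.
long_group <- c(NA, NA, 0.1965)

typeof(short_group)  # "logical"
typeof(long_group)   # "double"

# Supplying a double NA from the start keeps both cases consistent.
short_group_fixed <- c(NA_real_)
typeof(short_group_fixed)  # "double"
```

This mismatch between the "logical" short group and the "double" long groups is what data.table reports when it requires consistent column types across groups.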