If a function returns two or more values and fill = NA is used, rollapply becomes much slower. Is there any way to avoid this?
library(zoo)
library(microbenchmark)
f1 <- function(v) c(mean(v) + median(v))  # returns a vector of length 1
f2 <- function(v) c(mean(v), median(v))   # returns a vector of length 2
v <- rnorm(1000)
microbenchmark(rollapplyr(v, 20, f1), rollapplyr(v, 20, f1, fill = NA))
# expr min lq mean median uq max neval
# rollapplyr(v, 20, f1) 50.84485 53.68726 57.21892 54.63793 57.78519 75.88305 100
# rollapplyr(v, 20, f1, fill = NA) 52.11355 54.69866 59.73473 56.20600 63.10546 99.96493 100
microbenchmark(rollapplyr(v, 20, f2), rollapplyr(v, 20, f2, fill = NA))
# expr min lq mean median uq max neval
# rollapplyr(v, 20, f2) 51.77687 52.29403 56.80307 53.44605 56.65524 105.6713 100
# rollapplyr(v, 20, f2, fill = NA) 69.93853 71.08953 76.48056 72.21896 80.58282 151.4455 100
The reason lies in how fast na.fill runs on different types of data, since that is what the rollapply() function calls internally. With your f1 the rolled result is a single vector, whereas with f2 it is a matrix of two columns (well, both are zoo objects actually, but you catch my drift).
The slowdown from inserting the NAs is far out of proportion to the mere doubling of the number of elements, as this benchmark shows:
library(zoo)
library(microbenchmark)
v <- zoo(rnorm(1000))
m <- zoo(matrix(rnorm(2000), ncol=2))
ix <- seq(1000)>50
microbenchmark(na.fill(v, NA, ix), na.fill(m, NA, ix))
# Unit: microseconds
# expr min lq mean median uq max neval
# na.fill(v, NA, ix) 402.861 511.912 679.1114 659.597 754.8385 4716.46 100
# na.fill(m, NA, ix) 9746.643 10091.038 14281.5598 14057.304 17589.9670 22249.96 100
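Since the cost seems to sit in filling a matrix, one possible workaround is to call rollapplyr without fill and pad the leading rows with NA by hand. The following is only a minimal sketch under that assumption: rollapplyr_pad is a hypothetical helper, not part of zoo, and it only covers the right-aligned, fixed-width case from the question. It has not been benchmarked here.

```r
library(zoo)

# f2 is the two-value function from the question
f2 <- function(v) c(mean(v), median(v))

# Hypothetical helper (not part of zoo): roll without fill, then pad manually.
# Assumes a single fixed window width and right alignment.
rollapplyr_pad <- function(x, width, FUN, ...) {
  res <- rollapplyr(x, width, FUN, ...)  # no fill: complete windows only
  res <- as.matrix(res)                  # one column per value returned by FUN
  pad <- matrix(NA_real_, nrow = width - 1, ncol = ncol(res))
  rbind(pad, res)                        # right-aligned, so the NAs go on top
}

set.seed(1)
v <- rnorm(1000)
out <- rollapplyr_pad(v, 20, f2)
```

Before relying on it, it would be worth comparing the result with rollapplyr(v, 20, f2, fill = NA) via all.equal(..., check.attributes = FALSE) and timing both with microbenchmark on your own data.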