
Efficient calculation of multiple rolling quantiles

From some other calculations I have a long vector, vec, with about 4,500,000 entries. I'd like to calculate the 5th, 25th, 50th, 75th and 95th percentiles over a rolling window of period = 1000, i.e. for elements 1 to 1000 of vec, then for elements 2 to 1001, and so on.

Here is some example code and how I would have solved that problem:

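vec <- rnorm(4500000)  # create sample data
period <- 1000         # rolling window length
# One row per position; rows 1 to period - 1 stay NA because their window is incomplete
res <- matrix(nrow = length(vec), ncol = 5)
for (i in period:length(vec)) {
  # quantiles of the window ending at element i
  res[i, ] <- quantile(vec[(i - period + 1):i], probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
}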

(Although I used rnorm to create example data, my data is not normally distributed and the standard deviation is not constant!)

However, this implementation takes a long time to run, so I'm looking for a more time-efficient implementation in R.

You can use the sapply function:

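res <- sapply(period:length(vec), function(x) quantile(vec[(x - period + 1):x], probs = c(0.05, 0.25, 0.5, 0.75, 0.95)))
# sapply returns one column per window; transpose to one row per window.
# Note: this has length(vec) - period + 1 rows, with no leading NA rows.
res <- t(res)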

I just found the runquantile function from the caTools package. It did the job very fast.
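For reference, a minimal sketch of how runquantile could be called for this problem. It assumes caTools is installed, uses a much smaller sample than the original 4.5 million values so that the loop comparison finishes quickly, and the names res_fast and res_loop are only illustrative. The align = "right" and endrule = "NA" settings are my choice, intended to mimic the loop in the question, where row i describes the window ending at element i.

```r
library(caTools)  # assumed to be installed: install.packages("caTools")

set.seed(42)
vec    <- rnorm(20000)   # smaller sample than in the question, for a quick check
period <- 1000
probs  <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# align = "right": row i describes the window vec[(i - period + 1):i]
# endrule = "NA":  rows without a full window are left as NA
res_fast <- runquantile(vec, k = period, probs = probs,
                        endrule = "NA", align = "right")

# Reference: the explicit loop from the question
res_loop <- matrix(nrow = length(vec), ncol = length(probs))
for (i in period:length(vec)) {
  res_loop[i, ] <- quantile(vec[(i - period + 1):i], probs = probs)
}

# Compare the rows that have a full window, ignoring dimnames
full <- period:length(vec)
all.equal(res_fast[full, ], res_loop[full, ], check.attributes = FALSE)
```

Both runquantile and quantile default to type = 7, so the two results should be comparable. It is still worth running a check like this on a subset of the real data before trusting the fast version on the full vector.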

roll_quantile from the roll package is the fastest implementation I know of. It works on matrices too.

https://search.r-project.org/CRAN/refmans/roll/html/roll_quantile.html
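A rough sketch of how this might look for the five quantiles in the question. It assumes roll is installed and that roll_quantile takes a single probability through its p argument, which is why it is called once per probability here; please check the linked documentation for the exact signature. The name res_roll is only illustrative.

```r
library(roll)  # assumed to be installed: install.packages("roll")

set.seed(42)
vec    <- rnorm(20000)   # smaller sample than in the question
period <- 1000
probs  <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# One call per probability (assumption: p is a single value);
# sapply() binds the five results into a length(vec) x 5 matrix
res_roll <- sapply(probs, function(p) roll_quantile(vec, width = period, p = p))
colnames(res_roll) <- paste0(probs * 100, "%")

# Rows 1 to period - 1 are expected to be NA (incomplete window)
head(res_roll[period:nrow(res_roll), ])
```

The quantile definition used by roll_quantile may differ from the default type = 7 of quantile(), so small differences from the loop result are possible. Comparing the two on a subset is a sensible check if exact agreement matters.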
