
Compute rolling mean/standard deviation with different start dates using rollapply

Is there a start argument to the rollapply function from the zoo package? I would like to compute the standard deviation of each column of a data frame, but with a different starting date for each column.

This is how I compute the standard deviation by column for a large data frame:

library(zoo)
dat <- data.frame(cbind(runif(120), runif(120)))
StDev <- rollapply(dat, 12, sd, by = 12, na.rm = TRUE, by.column = TRUE, align = 'right', fill = NULL)

I would like rollapply to start at different rows of the data frame, but hardcoding it like this will take a very long time:

SD1 <- rollapply(dat$X1[1:120], 12, sd, by = 12, na.rm = TRUE, align = 'right', fill = NULL)  # start at the first row
SD2 <- rollapply(dat$X1[12:120], 12, sd, by = 12, na.rm = TRUE, align = 'right', fill = NULL)  # start at the 12th row

StDev <-cbind(SD1,c(NA,SD2))

> StDev_desired
            SD1       SD2
 [1,] 0.2717607        NA
 [2,] 0.2848454 0.2869931
 [3,] 0.3024353 0.3036127
 [4,] 0.1919298 0.1954726
 [5,] 0.3427318 0.3097042
 [6,] 0.3513110 0.3468135
 [7,] 0.3205552 0.3485802
 [8,] 0.2594149 0.2575002
 [9,] 0.3159097 0.3095329
[10,] 0.2967858 0.2786670

I would like to be able to pass the rolling function a vector of starting rows. I could align my data set first (shift up the observations in the columns where I want the rolling function to start later than in the rest), but I would like to know if there is a neater alternative.

In Stata, the -rolling- command has a start argument that does this.

Create a function that takes one column of the data frame and one element of the vector of starting positions, runs rollapplyr on the subsetted column, reverses the result and converts it to a zoo series. Applying it with Map over the columns and the starting positions gives a list of zoo series. cbind on zoo series of unequal length pads the shorter ones with NAs at the end, so all that is left is to reverse each column back (which moves the NAs to the start) and convert the result to a data.frame:

roll <- function(x, st) {
  zoo(rev(rollapplyr(x[st:length(x)], 12, sd, na.rm = TRUE, by = 12, fill = NULL)))
}

st <- c(1, 12)
m <- do.call(cbind, Map(roll, dat, st))
data.frame(lapply(as.list(m), rev), check.names = FALSE)
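For comparison, here is a minimal sketch of a variant that skips the reverse/zoo/cbind round trip and left-pads each result with NA directly. The helper name roll_from, its n_out and width parameters, and the seed value are assumptions made for illustration; they are not part of the zoo API or of the code above. It should produce the same layout as the desired output in the question: one row per 12-row window, with NA in the leading rows of columns that start later.

```r
library(zoo)

set.seed(123)  # assumed seed, only to make the random data reproducible
dat <- data.frame(X1 = runif(120), X2 = runif(120))
st <- c(1, 12)  # starting row for each column

# Hypothetical helper: SDs over non-overlapping windows of `width` rows,
# beginning at row `start`, left-padded with NA up to `n_out` values.
roll_from <- function(x, start, n_out, width = 12) {
  s <- rollapplyr(x[start:length(x)], width, sd, na.rm = TRUE,
                  by = width, fill = NULL)
  c(rep(NA_real_, n_out - length(s)), s)
}

# The column with the earliest start yields the most windows.
n_out <- floor((nrow(dat) - min(st) + 1) / 12)

res <- as.data.frame(Map(roll_from, dat, st, MoreArgs = list(n_out = n_out)))
res
```

Here the first row of X2 is NA and its second row is the standard deviation of rows 12 to 23, which lines up with the second window of X1 (rows 13 to 24), as in the hardcoded cbind(SD1, c(NA, SD2)) version.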

Next time please use set.seed(...) to make the code in the question reproducible.
