I have a time series data frame that looks like this:
2014-02-05 2014-02-06 2014-02-07 2014-02-12 2014-02-14 2014-02-17 2014-02-18 2014-02-19 ......
0.0379 -0.0008 0.0352 0.0379 0.0392 0.0173 0.0360 0.0371
I want to compute a moving standard deviation over every 5th day of this series in R. That is, I want to select a sample such that sample1[1] = (2014-02-05, 0.0379), sample1[2] = (2014-02-12, 0.0379), ..., find the standard deviation of that sample, and then roll forward to the next date, i.e. sample2[1] = (2014-02-06, -0.0008), sample2[2] = (2014-02-12, 0.0379), find the standard deviation of that sample, and so on.
Since the available dates are irregular, I cannot use seq(1:l, by = ). With rollapply, the function would take consecutive observations to compute the standard deviation. Is there an efficient way to sample every 5th day from this series, or to modify the standard deviation function so that it selects every 5th day and then computes the standard deviation on the available data? Any suggestions would be highly appreciated.
Restating Question: I am assuming you want to fill in the missing days and then, if z is the resulting series, calculate the following:
sd(c(z[1], z[6], z[11], z[16], z[21]))
sd(c(z[2], z[7], z[12], z[17], z[22]))
etc., keeping only those standard deviations whose windows start at times found in sample1.
If that is not the intent of the question, please clarify with further explanation and by giving an actual example of input and output.
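For concreteness, the positions entering each standard deviation can be listed with seq. This is only a sketch of the index pattern described above; window_idx is a hypothetical helper name used for illustration, not part of the original answer.

```r
# Positions used by the k-th window: k, k + 5, k + 10, k + 15, k + 20
window_idx <- function(k) seq(k, by = 5, length.out = 5)

window_idx(1)  # 1  6 11 16 21
window_idx(2)  # 2  7 12 17 22
```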
Answer: Create a daily grid g and merge it with sample1, filling in NAs from the end (each missing day takes the value of the next available observation) to give the filled-in series z. (Note that if points have gaps greater than 4 days, we do not fill those gaps, since that would involve including a point more than once in the sd.) Then use rollapply to compute the desired sd, keeping only the original times.
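library(zoo)
# daily grid spanning the observed dates (a zoo series with an index but no data)
g <- zoo(, seq(start(sample1), end(sample1), "day"))
# missing days take the next observation; gaps longer than 4 days stay NA
z <- na.locf(merge(sample1, g), fromLast = TRUE, maxgap = 4)
# sd of days 1, 6, 11, 16 and 21 of each 21-day window
r <- rollapply(z, 21, function(x) sd(x[seq(1, 21, 5)]), align = "left")
r[time(sample1)]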
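To try these steps end to end, here is a minimal sketch on simulated data. The eight values shown in the question cover only 15 calendar days, fewer than the 21 days a window needs, so the series below is hypothetical: eight weeks of weekday-only observations drawn with rnorm. It assumes sample1 is a zoo series indexed by Date, which the question does not state explicitly.

```r
library(zoo)

# Hypothetical data: 8 weeks of weekday-only observations (weekends missing)
set.seed(1)
all_days <- seq(as.Date("2014-02-03"), by = "day", length.out = 56)
obs_days <- all_days[as.POSIXlt(all_days)$wday %in% 1:5]
sample1  <- zoo(rnorm(length(obs_days), mean = 0.03, sd = 0.015), obs_days)

# Same steps as in the answer
g <- zoo(, seq(start(sample1), end(sample1), "day"))
z <- na.locf(merge(sample1, g), fromLast = TRUE, maxgap = 4)
r <- rollapply(z, 21, function(x) sd(x[seq(1, 21, 5)]), align = "left")
res <- r[time(sample1)]

# Spot check: the first result is the sd of filled days 1, 6, 11, 16, 21
first_by_hand <- sd(as.numeric(coredata(z))[seq(1, 21, 5)])
all.equal(as.numeric(coredata(res))[1], first_by_hand)
```

Only windows that fit entirely inside the series produce a value, so the last 20 calendar days have no entry in res.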
Note: The rollapply statement could alternatively be written like this:
r <- rollapply(z, list(seq(0, length = 5, by = 5)), sd)
since the width argument can be specified as a list containing a vector of offsets.
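As a quick check that the two formulations agree, the sketch below runs both on a small made-up daily series (the sin values are arbitrary placeholders, not data from the question) and compares values and time index.

```r
library(zoo)

# Hypothetical regular daily series, used only to compare the two formulations
z <- zoo(sin(1:30), as.Date("2014-01-01") + 0:29)

# Width 21, then pick every 5th element inside the window
r1 <- rollapply(z, 21, function(x) sd(x[seq(1, 21, 5)]), align = "left")
# Offsets 0, 5, 10, 15, 20 relative to each time point
r2 <- rollapply(z, list(seq(0, length.out = 5, by = 5)), sd)

same_values <- isTRUE(all.equal(as.numeric(coredata(r1)),
                                as.numeric(coredata(r2))))
same_times  <- length(r1) == length(r2) && all(time(r1) == time(r2))
c(same_values, same_times)
```

The offset form avoids passing 21 values to the function only to discard 16 of them.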
Update: Revised again after re-reading the question. Also provided an alternate rollapply expression.
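The following may be useful. Note that it computes the standard deviation over a window of 5 consecutive observations rather than every 5th calendar day: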
xx = structure(c(0.0379, -8e-04, 0.0352, 0.0379, 0.0392, 0.0173, 0.036,
0.0371), .Names = c("2014-02-05", "2014-02-06", "2014-02-07",
"2014-02-12", "2014-02-14", "2014-02-17", "2014-02-18", "2014-02-19"
))
xx
2014-02-05 2014-02-06 2014-02-07 2014-02-12 2014-02-14 2014-02-17 2014-02-18 2014-02-19
0.0379 -0.0008 0.0352 0.0379 0.0392 0.0173 0.0360 0.0371
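# positions 1 to 4 have no full window of 5 observations and stay NA
yy = rep(NA_real_, length(xx))
for(i in 5:length(xx)){
  # sd of the current observation and the 4 before it
  yy[i] = sd(xx[(i-4):i])
}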
yy
[1] NA NA NA NA 0.017212408 0.017278108 0.008982038 0.009130991
For a data frame version:
ddf = structure(list(date = structure(1:8, .Label = c("2014-02-05",
"2014-02-06", "2014-02-07", "2014-02-12", "2014-02-14", "2014-02-17",
"2014-02-18", "2014-02-19"), class = "factor"), value = c(0.0379,
-8e-04, 0.0352, 0.0379, 0.0392, 0.0173, 0.036, 0.0371)), .Names = c("date",
"value"), class = "data.frame", row.names = c(NA, -8L))
ddf
date value
1 2014-02-05 0.0379
2 2014-02-06 -0.0008
3 2014-02-07 0.0352
4 2014-02-12 0.0379
5 2014-02-14 0.0392
6 2014-02-17 0.0173
7 2014-02-18 0.0360
8 2014-02-19 0.0371
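# rows 1 to 4 have no full window; they keep the placeholder value 0
ddf$rolling_sd = 0
for(i in 5:nrow(ddf)){
  # sd of the current row and the 4 rows before it
  ddf$rolling_sd[i] = sd(ddf$value[(i-4):i])
}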
ddf
date value rolling_sd
1 2014-02-05 0.0379 0.000000000
2 2014-02-06 -0.0008 0.000000000
3 2014-02-07 0.0352 0.000000000
4 2014-02-12 0.0379 0.000000000
5 2014-02-14 0.0392 0.017212408
6 2014-02-17 0.0173 0.017278108
7 2014-02-18 0.0360 0.008982038
8 2014-02-19 0.0371 0.009130991
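The loop can also be replaced by a single call to rollapplyr from the zoo package. This is a sketch of an alternative, not part of the original answer; it rebuilds ddf with a Date column instead of a factor, and it marks the first four rows as NA rather than 0, which avoids reading the placeholder as a real standard deviation.

```r
library(zoo)

ddf <- data.frame(
  date  = as.Date(c("2014-02-05", "2014-02-06", "2014-02-07", "2014-02-12",
                    "2014-02-14", "2014-02-17", "2014-02-18", "2014-02-19")),
  value = c(0.0379, -0.0008, 0.0352, 0.0379, 0.0392, 0.0173, 0.0360, 0.0371)
)

# Right-aligned window of 5 consecutive rows; incomplete windows become NA
ddf$rolling_sd <- rollapplyr(ddf$value, 5, sd, fill = NA)
ddf
```

Rows 5 to 8 should match the values produced by the loop above.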