R: Volatility function that interprets NAs

Question

I am looking for help with getting a volatility function to work with my dataframe. In the function below, I'm trying to compute daily log returns of the price for each security (each column in my data holds a different security's prices over time) and then calculate an annualized volatility.

volcalc= function (x) {
  returns=log(x)-log(lag(x))
  vol=sd(returns)*sqrt(252)
  return(vol)
}

Then I run it with the call below, but it returns a numeric vector of length ncol containing only NAs.

testlag=apply(dataexample,2,volcalc)

My dataframe has NAs galore (it includes all assets over the entire time period, even if they weren't around at the time), and one clear problem is that my function does not handle the NAs. But when I tried adding na.rm=TRUE in various places in the function, it did not work at all.

Below is an example dataset, where the columns x and y are different securities, with each row representing a day.

structure(list(x = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 
NA, NA), y = c(3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, NA, NA, NA, NA
)), .Names = c("x", "y"), row.names = c(NA, 12L), class = "data.frame")

My question is: How do I either incorporate the NAs in the function or get around this problem in a different way by rewriting the function? Thank you for your help!

Answer 1

An alternative is to keep your data and replace every NA with the closest previous non-NA value by running the na.locf function (last observation carried forward) from the zoo package before you apply volcalc. Your original function has to be changed in any case, because using lag introduces at least one NA (with a lag of 1), as mentioned by Akrun.

library(zoo)               # provides na.locf
df.noNA <- na.locf(df)     # df: original data frame with NAs
apply(df.noNA, 2, volcalc) # using Akrun's corrected volcalc function
#       x        y 
#3.155899 1.592084

Which option you prefer depends very much on the proportion of NAs in your data and what you consider the 'true' volatility as the values returned will be different.
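To see why the two approaches give different numbers, here is a minimal base R sketch using column x from the question. The vector `filled` is written out by hand to mimic what carrying the last observation forward would produce for this column, so it is an illustration rather than a call to na.locf itself. The carried-forward prices add zero returns, which changes both the mean and the spread of the return series.

```r
# Column x from the question: ten prices followed by two missing days.
prices <- c(1:10, NA, NA)

# Option A: drop the NAs, then take log returns.
r_drop <- diff(log(prices[!is.na(prices)]))

# Option B: carry the last price forward (hand-written stand-in for LOCF).
filled <- c(1:10, 10, 10)
r_fill <- diff(log(filled))   # the last two returns are exactly 0

# Annualize with 252 trading days, as in the question.
vol_drop <- sd(r_drop) * sqrt(252)
vol_fill <- sd(r_fill) * sqrt(252)

print(c(dropped = vol_drop, filled = vol_fill))
```

The first value corresponds to dropping the NAs and the second to filling them, matching the x column in the two answers.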

Answer 2

We can remove the NA elements with x[!is.na(x)]. The lag(x) call will still return NA as the first element (this is the behavior of dplyr's lag; base R's stats::lag does not shift a plain vector), and that NA can be dropped by using na.rm=TRUE in sd.

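volcalc <- function(x) {
  x <- x[!is.na(x)]                        # drop missing prices
  returns <- log(x) - log(dplyr::lag(x))   # daily log returns; first element is NA
  vol <- sd(returns, na.rm = TRUE) * sqrt(252)  # annualize with 252 trading days
  return(vol)
}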

apply(dataexample, 2, volcalc)
#    x        y  
#3.012588 1.030484
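If you would rather not depend on which lag is in scope, the same calculation can be sketched in base R with diff(log(x)), which yields the log returns directly without a leading NA. The helper name `volcalc_base`, its `periods_per_year` argument, and the minimum-length guard are assumptions added for illustration and are not part of the original answer.

```r
# Example data from the question: two price series with trailing NAs.
dataexample <- data.frame(
  x = c(1:10, NA, NA),
  y = c(3:10, NA, NA, NA, NA)
)

# Hypothetical helper: annualized volatility of daily log returns.
volcalc_base <- function(x, periods_per_year = 252) {
  x <- x[!is.na(x)]                     # drop missing prices
  if (length(x) < 3) return(NA_real_)   # sd() needs at least two returns
  returns <- diff(log(x))               # log(x[t]) - log(x[t-1])
  sd(returns) * sqrt(periods_per_year)
}

# sapply iterates over the columns of a data frame.
vols <- sapply(dataexample, volcalc_base)
print(round(vols, 6))
```

One caveat of dropping NAs this way: if a series has missing values in the middle rather than only at the ends, the return across the gap spans several days but is treated as a single daily return.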

2 answers

solution1
2 2015-05-23 10:08:10

solution2
0 ACCEPTED 2015-05-22 20:16:09