简体   繁体   中英

Finding lag at which cross correlation is maximum ccf( )

I have 2 time series and I am using ccf to find the cross correlation between them. ccf(ts1, ts2) lists the cross-correlations for all time lags. How can I find the lag which results in maximum correlation without manually looking at the data?

Posting the answer http://r.789695.n4.nabble.com/ccf-function-td2288257.html

Find_Max_CCF<- function(a,b)
{
 d <- ccf(a, b, plot = FALSE)
 cor = d$acf[,,1]
 lag = d$lag[,,1]
 res = data.frame(cor,lag)
 res_max = res[which.max(res$cor),]
 return(res_max)
} 

I thought I'd redo the above function but have it find the absolute max correlation that returns the original correlation (positive or negative). I also maxed out (nearly) the number of lags.

Find_Abs_Max_CCF<- function(a,b)
{
 d <- ccf(a, b, plot = FALSE, lag.max = length(a)-5)
 cor = d$acf[,,1]
 abscor = abs(d$acf[,,1])
 lag = d$lag[,,1]
 res = data.frame(cor,lag)
 absres = data.frame(abscor,lag)
 absres_max = res[which.max(absres$abscor),]
 return(absres_max)
}

Because 3 is more than 4, I also had a stab at modifying this function, this time by implementing an idea from here :

ccfmax <- function(a, b, e=0)
{
 d <- ccf(a, b, plot = FALSE, lag.max = length(a)/2)
 cor = d$acf[,,1]
 abscor = abs(d$acf[,,1])
 lag = d$lag[,,1]
 res = data.frame(cor, lag)
 absres = data.frame(abscor, lag)
 maxcor = max(absres$abscor)
 absres_max = res[which(absres$abscor >= maxcor-maxcor*e &
                        absres$abscor <= maxcor+maxcor*e),]
 return(absres_max)
}

Essentially an "error" term is added, so that if there are several values close to the maximum, they all get returned, eg:

ayy <- jitter(cos((1:360)/5), 100)
bee <- jitter(sin((1:360)/5), 100)

ccfmax(ayy, bee, 0.02)
           cor lag
348  0.9778319  -8
349  0.9670333  -7
363 -0.9650827   7
364 -0.9763180   8

If no value for e is given it is taken to be zero, and the function behaves just like the one nvogen posted.

I've modified the original solution as well, in order to loop over the function and output the values corresponding to a character vector of indices (x):

abs.max.ccf <- function(x,a,b) {
  d <- ccf(a, b, plot=FALSE, lag.max=length(a)-5)
  cor <- d$acf[,,1]
  abscor <- abs(d$acf[,,1])
  lag <- d$lag[,,1]
  abs.cor.max <- abscor[which.max(abscor)]
  abs.cor.max.lag <- lag[which.max(abscor)]
  return(c(x, abs.cor.max, abs.cor.max.lag))
}

I removed the data.frame part within the function, as it is unnecessarily slow. To loop over each column in a data.frame and return the results to a new data.frame , I use this method:

max.ccf <- lapply(colnames(df), function(x) unlist(abs.max.ccf(x, df$y, df[x])))
max.ccf <- data.frame(do.call(rbind, max.ccf))
colnames(max.ccf) <- c('Index','Cor','Lag')

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM