
How to calculate a rolling average correlation

Suppose you have N time series (xts class).

Can you suggest a way (for example, an existing function) to calculate a rolling average correlation (rolling = moving window)?

Say you have, for example, 10 time series. The first step is to calculate the 60-day correlation between the first and second, first and third, first and fourth, and so on. The second step is to calculate the average of those correlation values.

That is the end of the first cycle.

Then you advance by one day and repeat the whole process (first and second steps).

The result is a time series of the average correlation values.

Can anyone help me find an efficient way to do this?

Thanks!

This is the structure of my data:

structure(c(0.00693323784940425, 0.00119688823384623, 0.00413204756685159, 
0.00794053366741787, -0.00885729207412611, -0.0103255273426481, 
0.00526375949374813, 0.00367934409948933, -0.000445260763187072, 
0.00533008868350748, -0.0184988649053324, -0.00141119382173205, 
0.00912118322531175, 0.00260087310961143, -0.00517324445601819, 
0.000811187375852285, -0.0116665921404522, -0.00343004480926279, 
0.0120294054221377, -0.00215680014590536, 0.0168071183163816, 
0.00708735182246834, -0.0059733229016512, -0.0158720766901048, 
0.00624903406443433, 0.0027648628898902, -0.0017585734967982, 
0.0101320524039767, -0.0135954228883302, 0.000315347989116255, 
0.00954752335550202, -0.00916386710679085, 0.0133360711487689, 
0.00791710073166163, -0.00867438967357037, -0.0137301928119018, 
0.0139960297273252, 0.0117445218692636, 0.000686577438573366, 
0.0095629144062328, -0.0095629144062328, -0.0110422101956824, 
0.00400802139753909, 0.00319489089651892, 0.00238948739738154, 
0.00396983451091115, -0.010354532975581, -0.000800961196204764, 
0.00640343703520729, 0.00530505222969291, 0.000528960604769591, 
0.00211304885806918, -0.00901145774831846, -0.00266595732411989, 
0.016839280413353, 0.010194537979594, -0.00489550968298724, -0.00340329313170784, 
-0.0102799306197494, 0.0208301415149017, -0.000578168926731237, 
-0.000355597704215782, 0.000237079185558819, 0.000829334802584292, 
5.92118897646543e-05, 0, 0.0061306105275456, 0.0018738394950697, 
0.0011129482176262, 0.00604309135743897, -0.0124473568664056, 
-0.00649453341986561, 0.0103155526698018, 0.00357355949086502, 
-0.00357355949086502, 0.00666034812253669, -0.0138460077834108, 
-0.0155041865359653, 0.00548408883420493, 0.00733525242247035, 
0.00125208697492907, -0.0128031972436093, -0.0146826767924852, 
0, 0.00593340671593001, 0.00356546338719443, 0.00643017736636065, 
-0.00365347763152091, -0.0168898372113038, 0, 0.0070456351632, 
0.00699634129248716, 0.00150630794815321, -0.0115433205305631, 
-0.014377703821594, 0, 0.0117600151966468, 0.000543625998710162, 
-0.00490330592852084, -0.0193002958123656, -0.00782564083139015, 
0, -0.00162696142802687, 0.00116238534533863, 0.001161035774218, 
-0.00325430319748232, 0.000930882077925688, 0, -0.00701927122582013, 
-0.00145843487202635, 0.00315725823897228, 0.0053204478588742, 
-0.00168980124699214, 0, 0.00622099240950913, 0.00449248477550324, 
-0.00220133862496308, -0.0167525285370109, -0.0100485946017672, 
0, 0.0138102547827188, 0.006682892429688, -0.00485585172657022, 
-0.0167194630182061, -0.0196819849217924, 0, 0.00199401860686432, 
0.00567538413259872, -0.000566091155790538, -0.00198384647748195, 
-0.00826097847094331, 0.00342661671664768), .indexCLASS = c("POSIXct", 
"POSIXt"), tclass = c("POSIXct", "POSIXt"), .indexTZ = "GMT", tzone = "GMT", class = c("xts", 
"zoo"), index = structure(c(1396310400, 1396396800, 1396483200, 
1396569600, 1396828800, 1396915200), tzone = "GMT", tclass = c("POSIXct", 
"POSIXt")), .Dim = c(6L, 22L), .Dimnames = list(NULL, c("1", 
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "20", "21", "22")))

Suppose you have all the series in a data frame called X, in the first ten columns. Then:

sapply(1:(NROW(X)-59), function(U) mean(cor(X[U:(U+59), 1:10 ])))
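To see what the one-liner above returns, here is a minimal sketch on simulated data. The data are random stand-ins, and the helper name `roll_mean_cor` and its `width` and `cols` arguments are made up for illustration; it only generalises the hard-coded 59 to an arbitrary window width.

```r
# Simulated stand-in for the real data: 100 rows (days) x 10 columns (series).
set.seed(1)
X <- as.data.frame(matrix(rnorm(100 * 10), nrow = 100, ncol = 10))

# Hypothetical helper: mean of the full correlation matrix over each window.
roll_mean_cor <- function(X, width = 60, cols = 1:10) {
  starts <- seq_len(NROW(X) - width + 1)   # first row of every window
  sapply(starts, function(U) mean(cor(X[U:(U + width - 1), cols])))
}

avg_cor <- roll_mean_cor(X)
length(avg_cor)   # one value per window: rows 1-60, 2-61, ..., 41-100
```

Element i of the result belongs to the window that ends at row i + width - 1, so the first 59 rows have no value.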

If you don't have them in a data frame, then I think the easiest way is to make one first :) provided that your time series are all the same length.

X <- data.frame(X1=ts1, X2=ts2, .... etc)

(edit)

To exclude the diagonal 1's from the correlation matrix, you might first define a function that calculates the mean of all values below the diagonal (or above the diagonal, it doesn't make a difference, since the matrix is symmetric):

# mean of the strictly lower triangle, i.e. each pair of series counted once
meanLT <- function(x) mean(x[lower.tri(x)])
sapply(1:(NROW(X)-59), function(U) meanLT(cor(X[U:(U+59), 1:10])))

(Not tested, but I think it should work.)
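The difference between the two versions can be checked on simulated data. For N series the full correlation matrix holds N ones on the diagonal and every pair twice, so mean(C) = 1/N + (1 - 1/N) * meanLT(C); with N = 10 the first one-liner returns 0.1 + 0.9 times the pairwise average. The sketch below is only an illustration: the data are random, the dates are invented, and the names `ends`, `avg_pairwise`, `avg_full` and `result` are made up here.

```r
# Simulated stand-in data: 100 days x 10 series, with made-up dates.
set.seed(42)
X <- as.data.frame(matrix(rnorm(100 * 10), nrow = 100, ncol = 10))
dates <- seq(as.Date("2014-04-01"), by = "day", length.out = NROW(X))

# Mean of the strictly lower triangle: each pair of series counted once.
meanLT <- function(x) mean(x[lower.tri(x)])

width <- 60
ends  <- width:NROW(X)   # last row of every 60-row window

# Rolling average of the pairwise correlations (diagonal excluded) ...
avg_pairwise <- sapply(ends, function(e) meanLT(cor(X[(e - width + 1):e, 1:10])))
# ... and of the full matrix (diagonal included), for comparison.
avg_full <- sapply(ends, function(e) mean(cor(X[(e - width + 1):e, 1:10])))

# Check the identity mean(C) = 1/N + (1 - 1/N) * meanLT(C) for N = 10.
all.equal(avg_full, 1/10 + (9/10) * avg_pairwise)

# Attach each value to the date on which its window ends.
result <- data.frame(date = dates[ends], avg_cor = avg_pairwise)
head(result, 3)
```

If X is an xts object rather than a data frame, the same window-end alignment could presumably be obtained with `xts(avg_pairwise, order.by = index(X)[ends])`, assuming the xts package is loaded; that variant is not tested here.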
