Rolling correlation with 'grouped by' - Error: incorrect number of dimensions

I'm trying to calculate rolling correlations with a five-year window on daily stock data. My data frame test has 20 columns, with "logRet3" in column #17 and "logMarRet3" in column #18. I want to calculate the correlation of these two return measures.

What makes it difficult is the fact that I want the rolling correlation to be grouped by my share indicator "PERMNO" in column #1. By that I mean that the rolling correlation "restarts" whenever the time-series data of a particular stock ends.

Through research I came up with the following code, using the dplyr, zoo and magrittr packages:

test <- test %>% 
  group_by(PERMNO) %>% 
  mutate(CorSecMar = zoo::rollapply(test, width = 1255, function(x) cor(x[,logRet3], x[,logMarRet3]), fill = NA, align = "right"))

However, when I run this code, I get the following error:

Error in x[,logMarRet3]: Incorrect number of dimensions

Being a newbie, I tried adjusting the code by deleting the commas:

test <- test %>% 
  group_by(PERMNO) %>% 
  mutate(CorSecMar = zoo::rollapply(test, width = 1255, function(x) cor(x[logRet3], x[logMarRet3]), fill = NA, align = "right"))

resulting in the following error (translated to English):

Error in x[logMarRet3]: Only zeros are allowed to be mixed with negative indices

Any help on how to fix these errors or alternative ways of calculating the rolling correlation by group would be greatly appreciated.

EDIT: Thanks to G. Grothendieck for pointing out some flaws in my question. I'm referring to his answer for reproducible input and will keep that in mind for further posts.

There are several problems:

  • rollapply applies to each column separately unless by.column = FALSE is used.

  • using test inside the grouped mutate does not subset test by group; it still refers to the entire dataset. Use the individual column names instead.

  • the column names in the question's code must have quotes around them; without quotes, R treats them as variables whose values are expected to hold the column names.

  • when posting to SO you need to reduce your problem to a complete reproducible example and post that. I have done it this time for you in the Note at the end.
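The first point can be reproduced in isolation. The following is a minimal sketch, assuming a small made-up matrix m (not the question's data): with the default by.column = TRUE the function receives one column of the window as a plain vector, so x[, 1] fails, whereas with by.column = FALSE it receives the whole window as a matrix.

```r
library(zoo)

# Hypothetical two-column input, used only for illustration.
m <- cbind(a = 1:6, b = (1:6)^3)

# Default by.column = TRUE: the function sees one column of the window at a
# time (a plain vector), so two-dimensional indexing raises an error.
err <- tryCatch(
  rollapplyr(m, 4, function(x) cor(x[, 1], x[, 2]), fill = NA),
  error = function(e) conditionMessage(e)
)

# by.column = FALSE: the function sees the whole window as a matrix with
# 4 rows and 2 columns, so x[, 1] and x[, 2] are valid.
per_window <- rollapplyr(m, 4, function(x) cor(x[, 1], x[, 2]),
                         by.column = FALSE, fill = NA)
```

Here err holds the error message from the first call, and per_window holds one correlation per row, with NA for the rows that do not yet have a full window.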

With reference to the Note, use this code:

library(dplyr)
library(zoo)

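# x is one window: a matrix whose columns are a and b
mycor <- function(x) cor(x[, 1], x[, 2])
DF %>%
  group_by(stock) %>%  # the rolling window restarts for each stock
  mutate(Cor = rollapplyr(cbind(a, b), 4, mycor, by.column = FALSE, fill = NA)) %>%
  ungroup()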
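Translated back to the column names in the question, the same pattern might look like the sketch below. The data frame here is a randomly generated stand-in, since the real test data is not available, and the window width is shortened to 5 so the example produces non-NA values; the question's width of 1255 would go in its place.

```r
library(dplyr)
library(zoo)

set.seed(1)
# Hypothetical stand-in for the question's `test` data frame.
test <- data.frame(
  PERMNO     = rep(c(10001, 10002), each = 8),
  logRet3    = rnorm(16),
  logMarRet3 = rnorm(16)
)

width <- 5  # assumption for the demo; the question uses 1255

test <- test %>%
  group_by(PERMNO) %>%
  # cbind() builds a two-column matrix from the current group's rows only,
  # so each window never mixes data from different stocks.
  mutate(CorSecMar = rollapplyr(cbind(logRet3, logMarRet3), width,
                                function(x) cor(x[, 1], x[, 2]),
                                by.column = FALSE, fill = NA)) %>%
  ungroup()
```

The first width - 1 rows of each PERMNO are NA because they do not yet have a full window.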

Alternatively, this code uses only zoo; mycor is defined above.

library(zoo)

n <- nrow(DF)
roll <- function(i) rollapplyr(DF[i, c("a", "b")], 4, mycor, by.column = FALSE, fill = NA)
transform(DF, Cor = ave(1:n, stock, FUN = roll))

Note

The input in reproducible form is:

DF <- data.frame(stock = rep(LETTERS[1:2], each = 6), a = 1:6, b = (1:6)^3)
