
Series of correlation matrices in R

Given the following small example:

library(tibble)
# note: the %B month-name format depends on an English locale
d1=as.Date('April 26, 2001',format='%B %d, %Y')
d2=as.Date('April 27, 2001',format='%B %d, %Y')
d3=as.Date('April 28, 2001',format='%B %d, %Y')
tibble(DATE=c(d1,d1,d2,d2,d3,d3), Symbol=c("A","B","A","B","A","B"), voladj=c(0.2, 0.3, -0.2, -0.1, 0.3, 0.2))

resulting in

# A tibble: 6 x 3
  DATE       Symbol voladj
  <date>     <chr>   <dbl>
1 2001-04-26 A         0.2
2 2001-04-26 B         0.3
3 2001-04-27 A        -0.2
4 2001-04-27 B        -0.1
5 2001-04-28 A         0.3
6 2001-04-28 B         0.2

I am trying to compute a series of correlation/covariance matrices: cor at time d2, cor at time d3, and so on. Ideally the data would be exponentially weighted. What options do I have in R? To make things more interesting, a Symbol C may show up at some point, too. I was thinking of computing the outer product (a rank-1 matrix) at times t1, t2, t3, and then taking a simple moving mean.

A potential output could be the following:

  DATE       cov
  <date>     
1 2001-04-26  M1
2 2001-04-27  M2
3 2001-04-28  M3

where the M_i are matrices (or data frames), such as

M1 =     A    B
      A  1.0  c1
      B  c1   1.0

and so on. This obviously gets more interesting once more symbols are involved.

Updated answer, given comments

Here is an approach that uses quantmod to retrieve three weeks of data for five stocks from Yahoo Finance. We combine the Close columns from the xts objects into a data frame, generate week identifiers with lubridate::week(), split() the data by week, and calculate a covariance matrix for each week using lapply().

library(quantmod)
from.dat <- as.Date("12/03/19",format="%m/%d/%y")
to.dat <- as.Date("12/24/19",format="%m/%d/%y")

theSymbols <- c("AAPL","AXP","BA","CAT","CSCO")
getSymbols(theSymbols,from=from.dat,to=to.dat,src="yahoo")

#combine to single data frame
combinedData <- data.frame(date = as.Date(rownames(as.data.frame(AAPL))),
                           AAPL$AAPL.Close,
                           AXP$AXP.Close,
                           BA$BA.Close,
                           CAT$CAT.Close,
                           CSCO$CSCO.Close)
colnames(combinedData) <- c("date","AAPL","AXP","BA","CAT","CSCO")
# split by week
library(lubridate)
combinedData$week <- week(combinedData$date)
symbolsByWeek <- split(combinedData,as.factor(combinedData$week))
covariances <- lapply(symbolsByWeek,function(x){
        # drop the date (column 1) and week (column 7) columns before cov()
        cov(x[,-c(1,7)])
})
covariances[[1]]

...and the output:

> covariances[[1]]
           AAPL        AXP         BA        CAT        CSCO
AAPL 19.4962156  7.0959976  3.9093027  5.4158116 -0.66194433
AXP   7.0959976  3.0026695  2.0175793  2.2569625 -0.18793832
BA    3.9093027  2.0175793 10.4511473  1.8555752  0.55619975
CAT   5.4158116  2.2569625  1.8555752  1.8335361 -0.11141911
CSCO -0.6619443 -0.1879383  0.5561997 -0.1114191  0.07287982
> 
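
The weekly matrices above weight every day in a week equally, whereas the question asks for exponential weighting. As a minimal sketch of the outer-product idea from the question, the base R code below rebuilds the question's toy data, pivots it to one row per date, and applies the recursion S_t = lambda * S_(t-1) + (1 - lambda) * r_t r_t'. It assumes that voladj is already centered (so the outer product can stand in for a one-day covariance estimate), that lambda = 0.94 is an acceptable decay factor, and that a symbol missing on a date contributes zero. All three are assumptions, not something the question specifies.

```r
# Toy data in the question's long format (values copied from the question).
dat <- data.frame(
  DATE   = as.Date(rep(c("2001-04-26", "2001-04-27", "2001-04-28"), each = 2)),
  Symbol = rep(c("A", "B"), times = 3),
  voladj = c(0.2, 0.3, -0.2, -0.1, 0.3, 0.2)
)

# Wide matrix: one row per date, one column per symbol.
# A symbol that is absent on a date (e.g. a late-arriving "C") becomes NA.
wide <- tapply(dat$voladj, list(format(dat$DATE), dat$Symbol), mean)

lambda <- 0.94  # hypothetical decay factor, chosen only for illustration

ewmaCov <- vector("list", nrow(wide))
names(ewmaCov) <- rownames(wide)
S <- NULL
for (i in seq_len(nrow(wide))) {
  r <- wide[i, ]
  r[is.na(r)] <- 0          # crude assumption: missing symbols contribute zero
  op <- outer(r, r)         # rank-1 outer product for this date
  # first date seeds the recursion; later dates blend old and new
  S <- if (is.null(S)) op else lambda * S + (1 - lambda) * op
  ewmaCov[[i]] <- S
}

# One row per date with the matrices stored in a list column,
# similar to the output layout sketched in the question.
result <- data.frame(DATE = as.Date(names(ewmaCov)))
result$cov <- unname(ewmaCov)

# Correlation matrices, if those are wanted instead of covariances.
ewmaCor <- lapply(ewmaCov, cov2cor)
```

The matrix for a given date is then available as ewmaCov[["2001-04-27"]] or result$cov[[2]]. Note that the first matrix is a single outer product and so has rank 1, which means its correlations are exactly 1 or -1 until more dates have been blended in.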

Original answer

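Here is an approach that uses quantmod to retrieve Dow 30 data for four days from Yahoo Finance, lapply() and do.call() with rbind() to massage it into a single data frame, and split() to split by day to produce daily covariance matrices. Note that each daily matrix is computed across the 30 symbols, so it describes the covariance among the six price and volume columns rather than between symbols.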

library(quantmod)
from.dat <- as.Date("12/02/19",format="%m/%d/%y")
to.dat <- as.Date("12/06/19",format="%m/%d/%y")

theSymbols <- c("AAPL","AXP","BA","CAT","CSCO","CVX","XOM","GS","HD","IBM",
                "INTC","JNJ","KO","JPM","MCD","MMM","MRK","MSFT","NKE","PFE","PG",
                "TRV","UNH","UTX","VZ","V","WBA","WMT","DIS","DOW")
getSymbols(theSymbols,from=from.dat,to=to.dat,src="yahoo")
# since quantmod::getSymbols() writes named xts objects into the environment,
# use get() with the symbol names to access each one
# e.g. head(get(theSymbols[[1]]))
# convert each xts object to a data frame and collect them in a list
symbolData <- lapply(theSymbols,function(x){
     y <- as.data.frame(get(x))
     colnames(y) <- c("open","high","low","close","volume","adjusted")
     # add date and symbol name to output data frames 
     y$date <- rownames(y)
     y$symbol <- x
     y
})
#combine to single data frame
combinedData <- do.call(rbind,symbolData)
# split by day
symbolsByDay <- split(combinedData,as.factor(combinedData$date))
covariances <- lapply(symbolsByDay,function(x){
     cov(x[,1:6]) # only use first 6 columns 
})
# print first covariance matrix
covariances[1]

...and the output:

> covariances[1]
$`2019-12-02`
                  open          high           low         close        volume      adjusted
open          5956.289      5962.359      5811.514      5818.225 -9.274871e+07      5809.939
high          5962.359      5968.557      5817.580      5824.272 -9.314473e+07      5816.005
low           5811.514      5817.580      5671.809      5678.470 -9.188418e+07      5670.276
close         5818.225      5824.272      5678.470      5685.467 -9.155485e+07      5677.246
volume   -92748711.735 -93144729.578 -91884178.312 -91554853.356  4.365841e+13 -90986549.261
adjusted      5809.939      5816.005      5670.276      5677.246 -9.098655e+07      5669.171

>
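
The daily matrices above relate the price and volume columns to each other. To get covariances between symbols from the same long layout, one option is to pivot a single column (here close) to one row per date and one column per symbol, and then call cov() over the whole window or over rolling windows. The sketch below uses base R only and a small made-up data set in place of the downloaded data; the names longData, closeWide and width, and all of the prices, are hypothetical.

```r
# Hypothetical stand-in for combinedData: long format with date, symbol, close.
# The prices are invented for illustration and are not real quotes.
dates <- format(as.Date("2019-12-02") + 0:3)
longData <- data.frame(
  date   = rep(dates, times = 3),
  symbol = rep(c("AAPL", "AXP", "BA"), each = 4),
  close  = c(100, 101, 103, 102,
             50, 50.5, 51, 50,
             200, 198, 199, 201)
)

# One row per date, one column per symbol.
closeWide <- tapply(longData$close, list(longData$date, longData$symbol), mean)

# Covariance between symbols over the whole window.
covBySymbol <- cov(closeWide)

# Rolling version: one matrix per window of `width` consecutive dates,
# named by the last date in each window.
width <- 3
ends <- width:nrow(closeWide)
rollingCov <- lapply(ends, function(e) {
  cov(closeWide[(e - width + 1):e, , drop = FALSE])
})
names(rollingCov) <- rownames(closeWide)[ends]
```

With real data, returns rather than price levels would usually be the more meaningful input, and the window width is a judgment call.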
