
Linear regression for multivariate time series in R

As part of my data analysis, I am using linear regression analysis to check whether I can predict tomorrow's value using today's data.

My data are about 100 time series of company returns. Here is my code so far:

library(zoo)

# read the returns as a zoo series indexed by date
returns <- read.zoo("returns.csv", header=TRUE, sep=",", format="%d-%m-%y")
returns_lag <- lag(returns)
lm_univariate <- lm(returns_lag$companyA ~ returns$companyA)

This works without problems. Now I wish to run a linear regression for each of the 100 companies. Since setting up each linear regression model manually would take too much time, I would like to use some kind of loop (or apply function) to shorten the process.

My approach:

test <- lapply(returns_lag ~ returns, lm)

But this leads to the error "unexpected symbol in "test2"", since the tilde is not recognized there.

So, basically, I want to run a linear regression for every company separately.

The only question that looks similar to what I want is "Linear regression of time series over multiple columns". However, the data there seem to be stored in a matrix, and the code example is quite messy compared to what I was looking for.

Formulas are great when you know the exact names of the variables you want to include in the regression. When you are looping over values, they aren't so great. Here's an example that uses indexing to extract the columns of interest for each iteration:

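library(zoo)

# sample data: five dates, one simulated return per date for each company
x.Date <- as.Date("2003-02-01") + c(1, 3, 7, 9, 14) - 1
returns <- zoo(cbind(companya=rnorm(5), companyb=rnorm(5)), x.Date)
returns_lag <- lag(returns)  # at each date, holds the return of the next date

# loop over columns/companies
xx <- lapply(setNames(1:ncol(returns), names(returns)), function(i) {
    today <- returns_lag[,i]            # returns from the second date onward
    yesterday <- head(returns[,i], -1)  # returns up to the second-to-last date
    lm(today ~ yesterday)
})
xx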

This will return the fitted model for each column as a named list.
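Once the models are in a list, the usual next step is to pull the same quantity out of every fit. Here is a minimal sketch of that, using base R only and simulated data; the company names, the 20 observations and the plain data frame (instead of a zoo object) are assumptions made just for illustration.

```r
# Simulated returns for three hypothetical companies (illustration only).
set.seed(1)
returns <- data.frame(companyA = rnorm(20),
                      companyB = rnorm(20),
                      companyC = rnorm(20))

# One lagged regression per column: the next value regressed on the current one.
fits <- lapply(returns, function(x) {
  n <- length(x)
  today    <- x[-n]  # observations 1 .. n-1
  tomorrow <- x[-1]  # observations 2 .. n
  lm(tomorrow ~ today)
})

# Intercept and slope of every model in one matrix
# (rows = coefficients, columns = companies).
coefs <- sapply(fits, coef)

# R-squared of every model as a named numeric vector.
r2 <- sapply(fits, function(m) summary(m)$r.squared)

coefs
r2
```

Because lapply keeps the column names, the columns of coefs and the names of r2 identify the company each number belongs to.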

Using the dyn package (which loads zoo) we can do this:

library(dyn) 
z <- zoo(EuStockMarkets) # test data

lapply(as.list(z), function(z) dyn$lm(z ~ lag(z, -1)))
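If you would rather not depend on dyn, a similar alignment can be done by hand with zoo alone. The following is only a sketch under assumptions: the data are simulated, the column names are made up, and merging each series with its own one-period lag is my illustration of the idea, not a description of what dyn does internally.

```r
library(zoo)

# Simulated returns for two hypothetical companies on ten consecutive dates.
set.seed(42)
dates <- as.Date("2020-01-01") + 0:9
returns <- zoo(cbind(companya = rnorm(10), companyb = rnorm(10)), dates)

fits <- lapply(setNames(seq_len(ncol(returns)), colnames(returns)), function(i) {
  s <- returns[, i]
  # merge() lines up the series and its one-period lag on the date index;
  # the first date has no lagged value, so that entry is NA.
  aligned <- merge(today = s, yesterday = lag(s, -1))
  # lm() drops the incomplete first row through its default na.action.
  lm(today ~ yesterday, data = as.data.frame(aligned))
})

sapply(fits, coef)
```

Merging on the index keeps each observation paired with the value from the previous date, so the regression does not rely on the two vectors happening to line up by position.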
