
What is the best way to run a loop of regressions in R?

Assume that I have sources of data X and Y that are indexable, say matrices, and I want to run a set of independent regressions and store the results. My initial approach would be

results <- matrix(nrow = nrow(X), ncol = 2)
for (i in seq_len(nrow(X))) {
  # store the intercept and slope of the i-th regression
  results[i, ] <- coefficients(lm(Y[i, ] ~ X[i, ]))
}

But loops are bad, so I could do it with lapply as

out <- lapply(1:nrow(X), function(i) { coefficients(lm(Y[i,] ~ X[i,])) } )

Is there a better way to do this?

You are certainly overoptimizing here. The overhead of a loop is negligible compared to the cost of fitting the model, so the simple answer is: use whichever form you find the most understandable. I'd go for the for-loop, but lapply is fine too.
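To make that concrete, here is a minimal sketch on simulated data showing that the for-loop and lapply give the same coefficients, so the choice really is about readability. The data, the sizes m and n, and the helper fit_row are made up for illustration; vapply is included only as a third option that returns a matrix directly.

```r
set.seed(42)
m <- 5    # number of independent regressions (rows of X and Y); illustrative
n <- 20   # observations per regression; illustrative
X <- matrix(rnorm(m * n), nrow = m)
Y <- 1 + 2 * X + matrix(rnorm(m * n, sd = 0.5), nrow = m)

# Hypothetical helper: intercept and slope of the i-th regression
fit_row <- function(i) unname(coefficients(lm(Y[i, ] ~ X[i, ])))

# for-loop version, filling a preallocated matrix
res_loop <- matrix(NA_real_, nrow = m, ncol = 2)
for (i in seq_len(m)) res_loop[i, ] <- fit_row(i)

# lapply version, with the list bound into a matrix afterwards
res_list <- do.call(rbind, lapply(seq_len(m), fit_row))

# vapply version, which checks that each result is a numeric vector of length 2
res_vapply <- t(vapply(seq_len(m), fit_row, numeric(2)))

all.equal(res_loop, res_list)
all.equal(res_loop, res_vapply)
```

All three spend nearly all of their time inside lm(), which is why the looping construct itself hardly matters.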

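I use plyr for this kind of thing, but I agree that this is not a question of processing efficiency so much as of what you are comfortable reading and writing.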

If you just want to perform straightforward multiple linear regression, then I would recommend not using lm(). There is lsfit(), but I'm not sure it would offer that much of a speed-up (I have never performed a formal comparison). Instead I would recommend computing the least-squares estimate (X'X)^{-1}X'y using qr() and qr.coef(). This will allow you to perform multivariate multiple linear regression; that is, treating the response variable as a matrix instead of a vector and applying the same regression to each row of observations.

## Z: design matrix (n x p), shared by all regressions
## Y: matrix of observations (n x m; each row is one observation across the m responses)
## Estimation via multivariate multiple linear regression
beta <- qr.coef(qr(Z), Y)
## Fitted values
Yhat <- Z %*% beta
## Residuals
u <- Y - Yhat
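As a quick check of the qr() approach, here is a small sketch on simulated data that compares it with fitting each response separately via lm(). The sizes, the predictor x, and B_true are assumptions made up for the example; note that qr.coef() expects Y to have one row per row of Z, so each response vector sits in a column.

```r
set.seed(123)
n <- 50   # observations; illustrative
m <- 4    # response vectors sharing one design matrix; illustrative
x <- rnorm(n)
Z <- cbind(1, x)                               # n x 2 design: intercept and one predictor
B_true <- rbind(c(1, 2, 3, 4), c(0.5, -1, 2, 0))  # made-up true coefficients, 2 x m
Y <- Z %*% B_true + matrix(rnorm(n * m, sd = 0.1), nrow = n)  # n x m responses

# One QR decomposition, all m regressions solved at once (result is 2 x m)
beta <- qr.coef(qr(Z), Y)

# Reference: one lm() call per response column
beta_lm <- sapply(seq_len(m), function(j) coef(lm(Y[, j] ~ x)))

all.equal(unname(beta), unname(beta_lm))
```

The saving comes from decomposing Z only once; lm() also builds a model frame and a full fit object on every call, which the direct route skips.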

In your example, is there a different design matrix per vector of observations? If so, you may be able to modify Z in order to still accommodate this.
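One possible reading of "modify Z" is a block-diagonal design matrix, where each regression gets its own intercept and slope columns and the responses are stacked into one long vector. The sketch below is only my interpretation, on simulated data with made-up sizes, and it checks the result against the row-by-row lm() fits from the question.

```r
set.seed(1)
m <- 3    # number of regressions, each with its own predictor row; illustrative
n <- 10   # observations per regression; illustrative
X <- matrix(rnorm(m * n), nrow = m)
Y <- 2 + 3 * X + matrix(rnorm(m * n, sd = 0.1), nrow = m)

# Block-diagonal design: regression i owns columns 2i-1 (intercept) and 2i (slope)
Z <- matrix(0, nrow = m * n, ncol = 2 * m)
for (i in seq_len(m)) {
  rows <- ((i - 1) * n + 1):(i * n)
  Z[rows, 2 * i - 1] <- 1
  Z[rows, 2 * i] <- X[i, ]
}

# Stack the responses: row 1 of Y, then row 2, and so on
y_stacked <- as.vector(t(Y))

# Solve once, then reshape to one (intercept, slope) pair per row
beta_block <- matrix(qr.coef(qr(Z), y_stacked), nrow = m, byrow = TRUE)

# Reference: the row-by-row fits from the question
beta_loop <- t(sapply(seq_len(m), function(i) coef(lm(Y[i, ] ~ X[i, ]))))

all.equal(beta_block, unname(beta_loop))
```

Be aware that this dense Z has m * n rows and 2 * m columns, so it grows quadratically in m; for many regressions a plain loop, or a sparse matrix representation, is likely the better choice.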
