
Linear Regression in R - Constraints & Varying Number of Regressors

I want to do a linear regression with a varying number of regressors (sometimes 3, sometimes 15) and specific inequality constraints on some of the regressor coefficients: some must be >= 0, while others may also be negative.

I have done this with optim() and constrOptim(), both of which call another user-defined function that minimizes the residual sum of squares of the regression. My problem is that this only gives me the coefficients and no additional output such as residuals, $R^2$, etc.
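A minimal sketch of that approach, assuming a design matrix `X` and response vector `y` (all names here are hypothetical, not the asker's actual code); constrOptim() encodes linear inequality constraints as `ui %*% beta - ci >= 0`:

```r
# Constrained least squares via constrOptim() -- a sketch of the approach
# described above. X: n x k design matrix (k varies, e.g. 3 or 15);
# y: response vector of length n.
rss <- function(beta, X, y) sum((y - X %*% beta)^2)

# Example constraint: require the first two coefficients to be >= 0,
# leave the remaining ones unconstrained.
k  <- ncol(X)
ui <- diag(k)[1:2, , drop = FALSE]   # picks out beta[1] and beta[2]
ci <- rep(0, 2)

# Starting values must lie strictly inside the feasible region.
fit <- constrOptim(theta = rep(0.1, k), f = rss, grad = NULL,
                   ui = ui, ci = ci, X = X, y = y)
fit$par   # constrained coefficient estimates -- but no residuals, R^2, etc.
```

This illustrates the limitation in the question: `fit` contains the optimizer's output (coefficients, objective value, convergence code) but none of the regression diagnostics that `summary(lm(...))` would report.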

Is there an easy way to use lm(), nls(), or any other function that would account for the inequality constraints while also handling a varying number of regressors?
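The varying-number-of-regressors part on its own is easy with lm() if the formula is built programmatically; a sketch, assuming a data frame `df` with a response column `y` (names are illustrative):

```r
# Build an lm() formula from a character vector of regressor names,
# so the same code works whether there are 3 regressors or 15.
regressors <- c("x1", "x2", "x3")            # could just as well be 15 names
f   <- reformulate(regressors, response = "y")
fit <- lm(f, data = df)                      # unconstrained fit
summary(fit)                                 # residuals, R^2, etc. for free
```

This does not handle the inequality constraints, which is the crux of the question below.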

Since ordinary linear regression is essentially an orthogonal projection onto a vector space, it is not quite meaningful to talk about "linear regression" with constrained coefficients; once you impose inequality constraints, the problem becomes one of constrained least squares.

In any case, once a model is fitted, all the usual statistical quantities can be computed from the direct formulas: if your output is y and your model predicts y_pred, then the residuals are y - y_pred and R^2 = 1 - sum((y - y_pred)^2) / sum((y - mean(y))^2), etc.
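For example, given coefficient estimates `beta_hat` obtained from constrOptim() (or any other optimizer), the diagnostics can be recovered directly; a sketch, with `X` and `y` as in the question:

```r
# Recompute regression statistics from coefficients fitted elsewhere.
y_pred    <- X %*% beta_hat
residuals <- y - y_pred
rss <- sum(residuals^2)              # residual sum of squares
tss <- sum((y - mean(y))^2)          # total sum of squares
r2  <- 1 - rss / tss                 # coefficient of determination
```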

Beware that the usual confidence statements about the regression are no longer accurate, since they are based on assumptions about the error term (i.i.d. normally distributed) that a constrained fit does not satisfy.
