Why would R-squared decrease when I add an exogenous variable in OLS using Python statsmodels?
If I understand the OLS model correctly, this should never be the case?
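from statsmodels.regression.linear_model import OLS

# trades is assumed to be an existing pandas DataFrame
trades['const'] = 1  # intercept column
Y = trades['ret'] + trades['comms']
# X = trades[['potential', 'pVal', 'startVal', 'const']]  # with the constant
X = trades[['potential', 'pVal', 'startVal']]  # without the constant
ols = OLS(Y, X)
res = ols.fit()
print(res.summary())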
If I turn the const on, I get an R-squared of 0.22, and with it off, I get 0.43. How is that even possible?
See the answer to "Statsmodels: Calculate fitted values and R squared".
R-squared follows a different definition depending on whether there is a constant in the model or not.
R-squared in a linear model with a constant is the standard definition, which uses a mean-only model as the reference. The total sum of squares is demeaned.
R-squared in a linear model without a constant compares with a model that has no regressors at all, that is, one in which the effect of the constant is zero. In this case the R-squared calculation uses a total sum of squares that is not demeaned.
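Written out, the two definitions described above look roughly as follows. The symbols are notation chosen here for illustration and do not come from the original answer: $\hat{y}_i$ are the fitted values and $\bar{y}$ is the sample mean of the dependent variable.

```latex
% Residual sum of squares, common to both definitions
\mathrm{SSR} = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2

% Model with a constant: centered (demeaned) total sum of squares
R^2_{\text{centered}} = 1 - \frac{\mathrm{SSR}}{\sum_{i=1}^{n} (y_i - \bar{y})^2}

% Model without a constant: uncentered total sum of squares
R^2_{\text{uncentered}} = 1 - \frac{\mathrm{SSR}}{\sum_{i=1}^{n} y_i^2}
```

When $\bar{y}$ is far from zero, the uncentered denominator is much larger than the centered one, so the same residuals can produce a much higher R-squared.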
Since the definition changes if we add or drop a constant, the R-squared can go either way. The actual explained sum of squares will always increase if we add additional explanatory variables, or stay unchanged if the new variable doesn't contribute anything.
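To make the effect concrete, here is a minimal sketch using plain NumPy on a small made-up dataset (the numbers and the helper `fit_ssr` are illustrative assumptions, not the asker's data or a statsmodels function). It fits the same regressor with and without a constant and computes R-squared under each definition.

```python
import numpy as np

def fit_ssr(X, y):
    """Least-squares fit of y on X; returns the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

# Made-up data: y hovers around 10 and is only weakly related to x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([10.0, 9.0, 11.0, 10.0, 10.0])

X_no_const = x[:, None]                          # regressor only
X_const = np.column_stack([x, np.ones_like(x)])  # regressor plus constant

ssr_no_const = fit_ssr(X_no_const, y)
ssr_const = fit_ssr(X_const, y)

# The two reference sums of squares
uncentered_tss = float(y @ y)                     # not demeaned
centered_tss = float(((y - y.mean()) ** 2).sum()) # demeaned

# R-squared as defined for each kind of model
r2_no_const = 1 - ssr_no_const / uncentered_tss
r2_const = 1 - ssr_const / centered_tss

print(f"SSR without const: {ssr_no_const:.3f}, with const: {ssr_const:.3f}")
print(f"R2  without const: {r2_no_const:.3f}, with const: {r2_const:.3f}")
```

Adding the constant shrinks the residual sum of squares (from about 87.4 to 1.9), so the fit improves, yet the reported R-squared falls from about 0.83 to 0.05 because the reference total sum of squares changes. In statsmodels the two reference quantities are available on the fitted results object as `centered_tss` and `uncentered_tss`, which makes it possible to compare both models on the same basis.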