
Why would R-Squared decrease when I add an exogenous variable in OLS using python statsmodels

If I understand the OLS model correctly, this should never be the case?

from statsmodels.regression.linear_model import OLS

trades['const'] = 1
Y = trades['ret'] + trades['comms']

# X = trades[['potential', 'pVal', 'startVal', 'const']]  # with the constant
X = trades[['potential', 'pVal', 'startVal']]              # without the constant

ols = OLS(Y, X)
res = ols.fit()
res.summary()

If I turn the const on, I get an R-squared of 0.22, and with it off, I get 0.43. How is that even possible?

See the answer here: Statsmodels: Calculate fitted values and R squared

R-squared follows a different definition depending on whether there is a constant in the model or not.

R-squared in a linear model with a constant is the standard definition, which compares the fit against a mean-only model as the reference. The total sum of squares is demeaned.

R-squared in a linear model without a constant compares the fit against a model that has no regressors at all (equivalently, one where the constant's effect is zero). In this case the R-squared calculation uses a total sum of squares that is not demeaned.
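As an illustration, here is a minimal sketch on synthetic data (the variable names and the generated data are illustrative, not the asker's trades frame) that reproduces both definitions by hand and checks them against what statsmodels reports:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 5 + X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=100)

# Without a constant, statsmodels reports the *uncentered* R-squared.
res_nc = sm.OLS(y, X).fit()
tss_uncentered = np.sum(y ** 2)
print(res_nc.rsquared, 1 - res_nc.ssr / tss_uncentered)   # the two should match

# With a constant, statsmodels reports the usual *centered* R-squared.
res_c = sm.OLS(y, sm.add_constant(X)).fit()
tss_centered = np.sum((y - y.mean()) ** 2)
print(res_c.rsquared, 1 - res_c.ssr / tss_centered)        # the two should match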

Since the definition changes when we add or drop the constant, the R-squared can go either way. The actual explained sum of squares always increases when we add explanatory variables, or stays unchanged if the new variable contributes nothing.
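Continuing the sketch above, if both fits are measured against the same uncentered total sum of squares, the model with the constant never explains less; the apparent drop comes only from switching to the demeaned total sum of squares:

# More regressors can only shrink the residual sum of squares.
print(res_c.ssr <= res_nc.ssr)                     # True
print(1 - res_nc.ssr / tss_uncentered)             # uncentered R2, no constant
print(1 - res_c.ssr / tss_uncentered)              # uncentered R2, with constant (at least as large)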


