I'd like to apply the statsmodels.stats.diagnostic.compare_j
test to a linear and a log-linear model. The linear model formula is
Sale_Price ~ Overall_Qual + Gr_Liv_Area + Neighborhood + MS_SubClass + Bsmt_Exposure + Roof_Matl + Misc_Feature + Overall_Cond + Year_Built + Bsmt_Full_Bath + Total_Bsmt_SF + 1.
The log-linear model formula is
np.log(Sale_Price) ~ Overall_Qual + Gr_Liv_Area + Neighborhood + MS_SubClass + Bsmt_Exposure + Roof_Matl + Misc_Feature + Overall_Cond + Year_Built + Bsmt_Full_Bath + Total_Bsmt_SF + 1
(the same features, but with np.log(Sale_Price) instead of Sale_Price).
When I run the test, I get this error:
ValueError: endogenous variables in models are not the same
Is it possible to compare linear and log-linear models using this method? Does the comparison even make sense, or is neither model superior in this case? I ask because when I try this workaround
log_model.model.endog = np.exp(log_model.model.endog)
I get
ValueError: The exog in results_x and in results_z are nested. J comparison requires that models are non-nested.
I can't tell whether you are using a data frame. If you are, create a new column with the log of Sale_Price and use it as the dependent variable:
df['log_Sale_Price'] = np.log(df['Sale_Price'])
mod = smf.ols(formula='log_Sale_Price ~ Overall_Qual + Gr_Liv_Area..', data=df)
As for your second question, you should not use statsmodels.stats.diagnostic.compare_j
here, because the dependent variables are on different scales. This function appears to implement the same J test that is available in R, and according to the R manual:
The J test statistic is simply the marginal test of the fitted values in the augmented model.
Since the predicted values from the log model are on a different scale from those of the untransformed model, this will not work.
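To make the quoted description concrete, here is a minimal sketch of the J test done by hand for two non-nested models that share the same dependent variable. The data is synthetic, and the column names (x1, x2, y, fitted_z) are made up for illustration; this is not taken from the statsmodels or R source.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data with hypothetical columns; y actually depends on x1 only.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 1.0 + 2.0 * df["x1"] + rng.normal(scale=0.5, size=n)

# Two non-nested models for the SAME dependent variable y.
res_x = smf.ols("y ~ x1", data=df).fit()
res_z = smf.ols("y ~ x2", data=df).fit()

# J test of model x against model z: add z's fitted values to model x
# and read off the t-test on that extra regressor.
df["fitted_z"] = res_z.fittedvalues
aug = smf.ols("y ~ x1 + fitted_z", data=df).fit()
j_tstat = aug.tvalues["fitted_z"]
j_pvalue = aug.pvalues["fitted_z"]
print(j_tstat, j_pvalue)
```

The augmented regression only makes sense because fitted_z is a prediction of y itself. If res_z had been fitted to log(y), its fitted values would be predictions of a different quantity, which is the scale problem described above.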
If I understood your question correctly, you want to see whether log-transforming your dependent variable gives a better fit.
The primary reason for transforming the dependent variable is to make the residuals follow a Gaussian distribution more closely. You can plot the residuals against the predicted values to check this. You can also apply the Breusch-Pagan test, which tests for heteroscedasticity, and check whether the result improves with the log transformation.
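A rough sketch of the Breusch-Pagan comparison, assuming synthetic data with multiplicative noise and the hypothetical columns x, price and log_price (none of these come from the question's data set):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.diagnostic import het_breuschpagan

# Synthetic data: the noise is multiplicative, so the log scale should suit it.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"x": rng.uniform(0, 3, size=n)})
df["price"] = np.exp(1.0 + 0.8 * df["x"] + rng.normal(scale=0.3, size=n))
df["log_price"] = np.log(df["price"])

lin_fit = smf.ols("price ~ x", data=df).fit()
log_fit = smf.ols("log_price ~ x", data=df).fit()

# het_breuschpagan returns (LM statistic, LM p-value, F statistic, F p-value).
bp_lin = het_breuschpagan(lin_fit.resid, lin_fit.model.exog)
bp_log = het_breuschpagan(log_fit.resid, log_fit.model.exog)
print("linear model LM p-value:", bp_lin[1])
print("log model LM p-value:", bp_log[1])
```

A small p-value rejects constant residual variance. On real data, run the same two calls on your own fitted models and compare the p-values; a clearly larger p-value for the log model suggests the transformation helped.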