
Am I using statsmodels.stats.outliers_influence.variance_inflation_factor correctly?

I am currently using VIF to detect multicollinearity. However, there are few to no examples online that I can use as a reference, so I tried it on my own.

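import statsmodels.api as sm
from sklearn import preprocessing
from statsmodels.stats.outliers_influence import variance_inflation_factor

# df is the asker's DataFrame, loaded earlier (not shown in the post)
# Label-encode the categorical columns
cat_var = df[["BsmtExposure","MSZoning","Exterior1st","MSSubClass","GarageType","GarageFinish"]].apply(preprocessing.LabelEncoder().fit_transform)
dfX = df[["OverallQual","ExterQual","GrLivArea","1stFlrSF","GarageCars","BsmtQual","HeatingQC","YearBuilt"]]
data_categorical = dfX.join(cat_var)
# Prepend an intercept column named "const"
sm_data_categorical = sm.add_constant(data_categorical)
vifDf = sm_data_categorical.drop(["OverallQual","YearBuilt"],axis=1)
feature = vifDf.columns
print(feature)
# One VIF per column, in column order (the first value belongs to "const")
vif = [variance_inflation_factor(vifDf.values, i) for i in range(vifDf.shape[1])]
print(vif)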

Output:
[139.09182494163923, 1.9269169697717614, 1.794083234373851, 1.828696948899336, 1.6357605533337554, 1.680843256052908, 1.4734276288799137, 1.2599932369972506, 1.0704636681342352, 1.1139451723386682, 1.2658662212832537, 1.4714527943918547, 1.2728931548738207]

I also used sm.add_constant (statsmodels.api.add_constant).

Aaron, there is an R package called mctest that runs multicollinearity diagnostics on the variables. For more information on how to use it, see this link ( http://rfaqs.com/mctest-r-package-detection-collinearity-among-regressors ). Hope it helps.

