
Why am I not able to get the VIF using the statsmodels API?

I was looking at the following official documentation from statsmodels:

https://www.statsmodels.org/stable/generated/statsmodels.stats.outliers_influence.variance_inflation_factor.html

But when I try to run this code on a practice dataset (with statsmodels.api already imported as sm):

variance_inflation_factor=sm.stats.outliers_influence.variance_inflation_factor()
vif=pd.DataFrame()
vif['VIF']=[variance_inflation_factor(X_train.values,i) for i in range(X_train.shape[1])]
vif['Predictors']=X_train.columns

I get the error message: module 'statsmodels.stats.api' has no attribute 'outliers_influence'

Can anyone tell me the appropriate way to get this working?

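The error occurs because sm.stats refers to the module statsmodels.stats.api, which does not expose the outliers_influence submodule. There is also no need to define variance_inflation_factor by calling the function with no arguments: variance_inflation_factor is a function that takes two inputs, the array of predictors and the index of the column to evaluate. Import it directly from statsmodels.stats.outliers_influence instead: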

import pandas as pd
import numpy as np
from statsmodels.stats.outliers_influence import variance_inflation_factor

X_train = pd.DataFrame(np.random.standard_normal((1000, 5)), columns=[f"x{i}" for i in range(5)])
vif=pd.DataFrame()
vif['VIF']=[variance_inflation_factor(X_train.values,i) for i in range(X_train.shape[1])]
vif['Predictors']=X_train.columns

print(vif)

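which produces output similar to the following (the data is random and unseeded, so the exact values will differ between runs):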

        VIF Predictors
0  1.002882         x0
1  1.004265         x1
2  1.001945         x2
3  1.004227         x3
4  1.003989         x4
