Variance inflation factor output statsmodels

I am trying to check for multicollinearity using statsmodels, but my code does not show the variance inflation factors. Instead, the output is a DataFrame of generator objects:

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
variables = df[['Mileage','Year','EngineV']]
vif = pd.DataFrame()
vif['VIF'] = (variance_inflation_factor(variables.values,i) for i in range(variables.shape[1]))
vif['features'] = variables.columns

results in the output

                                                 VIF  | features
  ---------------------------------------------------------------
0 | <generator object <genexpr> at 0x0000023A9F204... | Mileage
1 | <generator object <genexpr> at 0x0000023A9F204... |    Year
2 | <generator object <genexpr> at 0x0000023A9F204... | EngineV

rather than giving the actual values. I am sure this is an easy fix but I am very new to Python and coding. Thanks

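The parentheses on the right-hand side create a generator expression, which is lazy, so the column ends up holding the generator object rather than the computed values. Convert the generator to a list, either with a list comprehension or with the list() function: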

vif['VIF'] = [variance_inflation_factor(variables.values,i) for i in range(variables.shape[1])]

or

vif['VIF'] = list(variance_inflation_factor(variables.values, i) for i in range(variables.shape[1]))
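
To see the difference between the two forms in isolation, here is a minimal sketch that needs neither pandas nor statsmodels. The function fake_vif is a hypothetical stand-in for variance_inflation_factor (it just returns the length of a column name), and the columns list is an assumption made for illustration; only the bracket style matters here.

```python
# Hypothetical stand-in for variance_inflation_factor, so the sketch
# runs without statsmodels or pandas. It is NOT a real VIF computation.
def fake_vif(columns, i):
    return float(len(columns[i]))

# Illustrative column names taken from the question.
columns = ["Mileage", "Year", "EngineV"]

# Parentheses build a lazy generator object; nothing is computed yet.
lazy = (fake_vif(columns, i) for i in range(len(columns)))

# Square brackets build a list and evaluate every element immediately.
eager = [fake_vif(columns, i) for i in range(len(columns))]

# list() consumes the generator and yields the same values.
from_generator = list(lazy)

print(type(lazy).__name__)       # generator
print(eager)                     # [7.0, 4.0, 7.0]
print(from_generator == eager)   # True
```

Both fixes produce the same list, so the choice between them is a matter of style; the list comprehension is usually the shorter one to read.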

