I am trying to check for multicollinearity using statsmodels, but my code does not output the variance inflation factors. Instead, I get a DataFrame of generator objects:
from statsmodels.stats.outliers_influence import variance_inflation_factor
variables = df[['Mileage','Year','EngineV']]
vif = pd.DataFrame()
vif['VIF'] = (variance_inflation_factor(variables.values,i) for i in range(variables.shape[1]))
vif['features'] = variables.columns
results in the output
VIF | features
---------------------------------------------------------------
0 | <generator object <genexpr> at 0x0000023A9F204... | Mileage
1 | <generator object <genexpr> at 0x0000023A9F204... | Year
2 | <generator object <genexpr> at 0x0000023A9F204... | EngineV
rather than giving the actual values. I am sure this is an easy fix, but I am very new to Python and coding. Thanks.
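You should convert the generator to a list. Parentheses around a `for` expression create a generator expression, which is evaluated lazily, so the column ends up holding the generator object itself instead of the computed values. Use a list comprehension or the list() function: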
vif['VIF'] = [variance_inflation_factor(variables.values, i) for i in range(variables.shape[1])]
or
vif['VIF'] = list(variance_inflation_factor(variables.values, i) for i in range(variables.shape[1]))
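To see why the brackets matter, here is a minimal sketch in plain Python, with no pandas or statsmodels needed. The function `stand_in_vif` is a hypothetical placeholder for `variance_inflation_factor(variables.values, i)`, and `n_columns` stands in for `variables.shape[1]`; neither comes from the original code.

```python
def stand_in_vif(i):
    # Hypothetical placeholder for variance_inflation_factor(variables.values, i).
    return i * i


n_columns = 3  # assumed stand-in for variables.shape[1]

# Parentheses build a generator: nothing is computed yet.
lazy = (stand_in_vif(i) for i in range(n_columns))

# Square brackets build a list: every value is computed immediately.
eager = [stand_in_vif(i) for i in range(n_columns)]

print(type(lazy).__name__)  # generator
print(eager)                # [0, 1, 4]

# list() consumes the generator and yields the same values.
from_generator = list(lazy)
print(from_generator)       # [0, 1, 4]

# A generator can only be consumed once; a second pass is empty.
print(list(lazy))           # []
```

Both fixes give the same values. The list comprehension is usually the more readable choice when you want all the values at once.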