
How to compute AIC for linear regression model in Python?

I want to compute AIC for linear models to compare their complexity. I did it as follows:

regr = linear_model.LinearRegression()
regr.fit(X, y)

aic_intercept_slope = aic(y, regr.coef_[0] * X.as_matrix() + regr.intercept_, k=1)

def aic(y, y_pred, k):
   resid = y - y_pred.ravel()
   sse = sum(resid ** 2)

   AIC = 2*k - 2*np.log(sse)

   return AIC

But I get a "divide by zero encountered in log" error.

sklearn's LinearRegression is good for prediction but pretty barebones, as you've discovered. (It's often said that sklearn stays away from all things statistical inference.)

The results object returned by statsmodels.regression.linear_model.OLS(...).fit() has an aic attribute, along with a number of other precomputed statistics.

However, note that OLS does not add an intercept by default: you need to add a column of ones to your X matrix yourself, which is what add_constant does.

from statsmodels.regression.linear_model import OLS
from statsmodels.tools import add_constant

regr = OLS(y, add_constant(X)).fit()
print(regr.aic)

The source is here if you are looking for a way to compute it manually while still using sklearn.
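For the manual route, here is a minimal sketch rather than a definitive implementation. It assumes Gaussian errors and takes k to be the number of estimated coefficients including the intercept, which should agree with the value statsmodels reports for OLS. The function name gaussian_aic is made up for illustration, and y_pred could come from regr.predict(X). Note that the AIC penalises minus twice the maximised log-likelihood, not the log of the SSE itself, and that np.log(0) is what produces the "divide by zero encountered in log" warning, so the sketch rejects a zero SSE explicitly.

```python
import math


def gaussian_aic(y, y_pred, k):
    """AIC of a least-squares fit, assuming Gaussian errors.

    y, y_pred: 1-D sequences of observed and predicted values.
    k: number of estimated coefficients, counting the intercept
       (an assumption of this sketch; some texts also count the variance).
    """
    n = len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, y_pred))
    if sse <= 0:
        # A perfect fit makes log(sse / n) undefined.
        raise ValueError("SSE is zero; the log-likelihood is undefined")
    # Maximised Gaussian log-likelihood, with the variance estimated as sse / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(sse / n) + 1)
    return 2 * k - 2 * log_lik
```

With a fitted sklearn model, a call might look like gaussian_aic(y, regr.predict(X), k=X.shape[1] + 1), assuming y is 1-D and the model was fitted with an intercept. Lower values indicate a better trade-off between fit and complexity, and only differences between models fitted to the same data are meaningful.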


 