Is LASSO regression implemented in Statsmodels?

I would love to use a linear LASSO regression within statsmodels so that I can use the 'formula' notation for writing the model; that would save me quite some coding time when working with many categorical variables and their interactions. However, it seems that it is not implemented in statsmodels yet?

Lasso is indeed implemented in statsmodels. The documentation is at the URL below:

http://www.statsmodels.org/dev/generated/statsmodels.regression.linear_model.OLS.fit_regularized.html

To be precise, the implementation in statsmodels is an elastic net that combines L1 and L2 regularization, with their relative weight set by the L1_wt parameter (L1_wt=1 gives a pure lasso fit, L1_wt=0 a ridge fit). You should look at the formula at the bottom of that page to make sure you are doing exactly what you want to do.
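As a rough illustration of the point above, here is a minimal sketch of a lasso fit written with the formula notation. The data frame, its column names (outcome, var1, var2) and the penalty weight alpha=0.05 are made up for the example and are not from the question; only the formula interface and fit_regularized with its method, alpha and L1_wt arguments are statsmodels calls.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: two categorical predictors and a numeric outcome.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "var1": rng.choice(["a", "b", "c"], size=n),
    "var2": rng.choice(["x", "y"], size=n),
})
df["outcome"] = 2.0 * (df["var1"] == "b") + rng.normal(scale=0.5, size=n)

# Formula notation: dummy variables and their interactions are built for us.
model = smf.ols("outcome ~ C(var1) * C(var2)", data=df)

# L1_wt=1.0 makes the elastic net penalty a pure L1 (lasso) penalty.
# alpha=0.05 is an arbitrary illustrative value, not a recommendation.
lasso_fit = model.fit_regularized(method="elastic_net", alpha=0.05, L1_wt=1.0)

# Label the coefficients with the design-matrix column names.
coefs = pd.Series(np.asarray(lasso_fit.params), index=model.exog_names)
print(coefs)
```

Note that a scalar alpha applies the same penalty to every coefficient, including the intercept. According to the documentation, alpha may also be an array with one weight per coefficient, which is one way to leave the intercept unpenalized.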

Besides the elastic net implementation, there is also a square root Lasso method implemented in statsmodels (method="sqrt_lasso" in fit_regularized).

One can use Patsy with scikit-learn to obtain the same results one would obtain with the formula notation in statsmodels. See code below:

import numpy as np
from patsy import dmatrices

# create dummy variables, and their interactions
y, X = dmatrices('outcome ~ C(var1)*C(var2)', df, return_type="dataframe")
# flatten y into a 1-D array so scikit-learn can understand it
y = np.ravel(y)

I can now use any model implemented in scikit-learn with the usual notation, with X as the independent variables and y as the dependent one.
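For completeness, here is a sketch of how the Patsy output might be passed to scikit-learn's Lasso. The toy data, the column names and alpha=0.05 are assumptions made for illustration. One design choice to be aware of: Patsy adds an "Intercept" column to X, while scikit-learn fits its own unpenalized intercept by default, so this sketch drops the Patsy column to avoid a redundant constant.

```python
import numpy as np
import pandas as pd
from patsy import dmatrices
from sklearn.linear_model import Lasso

# Hypothetical toy data, standing in for the df used in the answer.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "var1": rng.choice(["a", "b", "c"], size=n),
    "var2": rng.choice(["x", "y"], size=n),
})
df["outcome"] = 2.0 * (df["var1"] == "b") + rng.normal(scale=0.5, size=n)

# Build the design matrices from the formula, as in the answer.
y, X = dmatrices("outcome ~ C(var1) * C(var2)", df, return_type="dataframe")
y = np.ravel(y)

# Drop Patsy's constant column; Lasso estimates an intercept itself.
X = X.drop(columns="Intercept")

# alpha=0.05 is an arbitrary illustrative penalty strength.
lasso = Lasso(alpha=0.05)
lasso.fit(X, y)

# Coefficients labelled with the Patsy-generated column names.
coefs = pd.Series(lasso.coef_, index=X.columns)
print(lasso.intercept_)
print(coefs)
```

In practice the penalty strength would usually be chosen by cross-validation, for example with scikit-learn's LassoCV, rather than fixed by hand.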
