
How is the scikit Lasso/LARS used as a regressive feature selection tool?

I have about 22 predictor variables, x_i, which I want to reduce down to a subset that best describes y. A basic problem... However, I'm quite unclear how to use scikit-learn and linear_model.LassoLars to perform this task.

From the examples in their documentation, the code is simply something like:

from sklearn.linear_model import Lasso

alpha = 0.1
lasso = Lasso(alpha=alpha)

y_pred_lasso = lasso.fit(X_train, y_train).predict(X_test)

So this performs the regression with the lasso penalty, but I'm not sure how to go from y_pred_lasso to the output I actually want, i.e. which of the 22 original predictors best describe y_train.

You can access the selected features through the coef_ attribute of the Lasso instance once you have called fit on it. This attribute stores the weight of each feature; the features with a nonzero coefficient are the ones the lasso kept.

>>> lasso = Lasso(alpha=alpha).fit(X_train, y_train)
>>> lasso.coef_ != 0
array([ True,  True,  True, False, False,  True,  True,  True,  True,
        True,  True,  True,  True], dtype=bool)
>>> import numpy as np
>>> np.nonzero(lasso.coef_)
(array([ 0,  1,  2,  5,  6,  7,  8,  9, 10, 11, 12]),)
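Putting it together for the 22-predictor case in the question, here is a minimal self-contained sketch. It uses synthetic data from make_regression as a stand-in for your real X and y, and LassoCV to choose alpha by cross-validation rather than hard-coding 0.1; the x_i feature names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic stand-in for the 22 predictors described in the question;
# only 5 of them actually carry signal.
X, y = make_regression(n_samples=200, n_features=22, n_informative=5,
                       noise=1.0, random_state=0)
feature_names = [f"x_{i}" for i in range(X.shape[1])]  # hypothetical names

# LassoCV fits the lasso over a path of alphas and picks the best
# one by 5-fold cross-validation.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)

# The predictors the lasso kept are those with a nonzero coefficient.
selected = [name for name, coef in zip(feature_names, lasso.coef_)
            if coef != 0]
print(selected)
```

With real data you would substitute your own X, y, and column names; sklearn.feature_selection.SelectFromModel wraps the same coef_-based logic if you want it as a pipeline step.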
