
How is the scikit Lasso/LARS used as a regressive feature selection tool?

I have about 22 predictor variables, x_i, which I want to reduce down to a subset that best describes y. A basic problem... However, I'm quite unclear how to use scikit-learn and linear_model.LassoLars to perform this task.

From the examples in their documentation, the code is simply something like:

from sklearn.linear_model import Lasso

alpha = 0.1
lasso = Lasso(alpha=alpha)

y_pred_lasso = lasso.fit(X_train, y_train).predict(X_test)

So this performs the regression with the lasso penalty, but I'm not sure how to go from y_pred_lasso to the output I actually want, i.e. which of the 22 original predictors best describe y_train.

You can access the selected features through the coef_ attribute of the Lasso instance once you have called fit on it. This attribute stores the weight of each feature; the features with a nonzero coefficient are the ones the lasso kept.

>>> lasso = Lasso(alpha=alpha).fit(X_train, y_train)
>>> lasso.coef_ != 0
array([ True,  True,  True, False, False,  True,  True,  True,  True,
        True,  True,  True,  True], dtype=bool)
>>> import numpy as np
>>> np.nonzero(lasso.coef_)
(array([ 0,  1,  2,  5,  6,  7,  8,  9, 10, 11, 12]),)
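Putting it together for the 22-predictor case in the question, here is a minimal self-contained sketch. It uses synthetic data from make_regression as a stand-in for your real X and y, and LassoCV to choose alpha by cross-validation rather than hard-coding 0.1; the x_i feature names are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Synthetic stand-in for the 22 predictors described in the question;
# only 5 of them actually carry signal.
X, y = make_regression(n_samples=200, n_features=22, n_informative=5,
                       noise=1.0, random_state=0)
feature_names = [f"x_{i}" for i in range(X.shape[1])]  # hypothetical names

# LassoCV fits the lasso over a path of alphas and picks the best
# one by 5-fold cross-validation.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)

# The predictors the lasso kept are those with a nonzero coefficient.
selected = [name for name, coef in zip(feature_names, lasso.coef_)
            if coef != 0]
print(selected)
```

With real data you would substitute your own X, y, and column names; sklearn.feature_selection.SelectFromModel wraps the same coef_-based logic if you want it as a pipeline step.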
