
Caret package - glmnet variable importance

I am using the glmnet package to fit a LASSO regression, and I am now looking at feature importance with the caret package. What I don't understand is how the importance values are computed. Could anyone enlighten me? Is there a formula for these values, or are they based on the beta coefficients?

ROC curve variable importance
  only 7 most important variables shown (out of 25)
         Importance
feature1     0.8974
feature2     0.8962
feature3     0.8957
feature4     0.8744
feature5     0.8701
feature6     0.8658
feature7     0.8253

For a glmnet model, caret takes the coefficients of the final fit (at the selected lambda), drops the intercept, and uses the absolute value of each remaining coefficient as that variable's importance. The variables are then ranked by these absolute values.
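Written as a formula, and going only by the source shown in this answer, the unscaled importance of predictor j is the absolute value of its fitted coefficient at the chosen penalty. The symbols below are notation introduced here for illustration, not names used by caret:

```latex
\mathrm{Importance}_j = \bigl|\hat{\beta}_j(\lambda)\bigr|, \qquad j = 1, \dots, p \quad (\text{intercept excluded})
```

Note that `varImp()` on a `train` object rescales importances to a 0-100 range by default; pass `scale = FALSE` to see the raw absolute coefficients.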

To view the source code, run:

getModelInfo("glmnet")$glmnet$varImp

In short, these are the lines that compute it:

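function(object, lambda = NULL, ...) {

  ## skipping a few lines

  ## coefficients of the fitted model at the given lambda
  beta <- predict(object, s = lambda, type = "coef")
  if(is.list(beta)) {
    ## a list of coefficient matrices (e.g. one per class): bind them column-wise
    out <- do.call("cbind", lapply(beta, function(x) x[,1]))
    out <- as.data.frame(out)
  } else out <- data.frame(Overall = beta[,1])
  ## drop the intercept row and take absolute values
  out <- abs(out[rownames(out) != "(Intercept)",,drop = FALSE])
  out
}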
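The same steps can be reproduced by hand with glmnet alone. The following is a minimal sketch, not caret's own code path: the data are simulated, the feature names are made up, and `lambda = 0.1` is an arbitrary stand-in for whatever value caret would have tuned.

```r
# Sketch: recompute glmnet "importance" as absolute coefficients.
# Assumes the glmnet package is installed; all data here are simulated.
library(glmnet)

set.seed(1)
x <- matrix(rnorm(100 * 5), nrow = 100,
            dimnames = list(NULL, paste0("feature", 1:5)))
y <- 2 * x[, 1] - 1 * x[, 2] + rnorm(100)

fit <- glmnet(x, y, alpha = 1)   # alpha = 1 gives the LASSO penalty

lambda <- 0.1                    # hypothetical value, chosen only for illustration
beta <- predict(fit, s = lambda, type = "coefficients")

# Same steps as the caret snippet: first column, drop intercept, absolute value
imp <- data.frame(Overall = beta[, 1])
imp <- abs(imp[rownames(imp) != "(Intercept)", , drop = FALSE])

# Rank predictors from most to least important
imp_ranked <- imp[order(imp$Overall, decreasing = TRUE), , drop = FALSE]
print(imp_ranked)
```

Because the importance is just the size of each coefficient, the values are only comparable across variables when the predictors are on the same scale, for example after centering and scaling them.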

