
Caret package - glmnet variable importance

I am using the glmnet package to perform a LASSO regression, and I am now looking at feature importance using the caret package. What I don't understand is how the importance values are obtained. Could anyone enlighten me? Is there a formula for calculating these values, or are they based on the beta values (the model coefficients)?

ROC curve variable importance
  only 7 most important variables shown (out of 25)
         Importance
feature1     0.8974
feature2     0.8962
feature3     0.8957
feature4     0.8744
feature5     0.8701
feature6     0.8658
feature7     0.8253

For a glmnet model, caret looks at the coefficients of the final fit and takes their absolute values. These absolute coefficients (with the intercept dropped) are stored as the variable importance, and the variables are ranked by them.
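As a minimal sketch of that idea in base R: the coefficient values below are made up for illustration and do not come from the question's model.

```r
# Hypothetical coefficients, shaped like what a LASSO fit might return
beta <- c("(Intercept)" = 1.20, feature1 = -0.75, feature2 = 0.40, feature3 = 0.00)

# Drop the intercept and take absolute values: this is the raw importance
importance <- abs(beta[names(beta) != "(Intercept)"])

# Rank the variables from most to least important
ranked <- sort(importance, decreasing = TRUE)
print(ranked)
```

Note that the sign of a coefficient is discarded, so a strong negative effect ranks as high as a strong positive one, and a variable that LASSO shrinks to zero gets an importance of zero.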

To view the source code, you can type:

getModelInfo("glmnet")$glmnet$varImp

To summarize, these are the lines that calculate it:

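function(object, lambda = NULL, ...) {

  ## skipping a few lines

  ## coefficients of the fitted glmnet object at penalty value lambda
  beta <- predict(object, s = lambda, type = "coef")
  if(is.list(beta)) {
    ## a list of coefficient matrices (e.g. one per class): one column each
    out <- do.call("cbind", lapply(beta, function(x) x[,1]))
    out <- as.data.frame(out)
  } else out <- data.frame(Overall = beta[,1])
  ## drop the intercept and take absolute values
  out <- abs(out[rownames(out) != "(Intercept)",,drop = FALSE])
  out
}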
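To check this against a fitted model, the sketch below trains a LASSO model through caret on simulated data and compares the importance caret reports with the absolute coefficients computed by hand. The data, the variable names (`x1` to `x5`), and the lambda grid are assumptions made for illustration. Passing `scale = FALSE` keeps `varImp` from rescaling the values to a 0-100 range.

```r
library(caret)
library(glmnet)

set.seed(1)
# Simulated data (hypothetical): 100 rows, 5 predictors, only x1 and x2 matter
x <- matrix(rnorm(100 * 5), ncol = 5, dimnames = list(NULL, paste0("x", 1:5)))
y <- 2 * x[, "x1"] - 1 * x[, "x2"] + rnorm(100)

# alpha = 1 is the LASSO penalty; lambda is chosen by 5-fold cross-validation
fit <- train(x, y, method = "glmnet",
             tuneGrid = expand.grid(alpha = 1,
                                    lambda = 10^seq(-3, 0, length.out = 20)),
             trControl = trainControl(method = "cv", number = 5))

# Importance as reported by caret, on the raw (unscaled) coefficient scale
imp_caret <- varImp(fit, scale = FALSE)$importance

# The same quantity by hand: absolute coefficients at the selected lambda
beta <- as.matrix(coef(fit$finalModel, s = fit$bestTune$lambda))
imp_manual <- abs(beta[rownames(beta) != "(Intercept)", 1])

# Compare the two, matching rows by variable name
same <- all.equal(imp_caret$Overall, unname(imp_manual[rownames(imp_caret)]))
print(same)
```

If the description above holds, the comparison should report that the two vectors agree. Because the importance is the raw coefficient magnitude, it depends on the scale of each predictor, so it is easiest to interpret when the predictors are on comparable scales.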
