
How to calculate variable importance of each class with R caret package?

To output the variable importance of my random forest model (rfmodel_all is my model), I used the code below.

importance(rfmodel_all[11][[1]])
varImp(rfmodel_all)

I got the results below, but the per-class variable importance values from the two functions differ. What do the values for each class mean?

importance(rfmodel_all[11][[1]])

                        F1         F2         F3        F4       F5
 dem5m_field2   10.2504042  6.9464506  3.1169946 13.394995 17.52028
 ah             -2.5141337 -3.9860137  3.1314217 11.585716 13.33464

varImp(rfmodel_all)

rf variable importance

  variables are sorted by maximum importance across the classes
                    F1     F2     F3    F4     F5
 vd             72.436 98.173 54.284 91.48 100.00
 twi            10.412  8.235 22.369 92.55  82.67

If you know of any references that explain this, please point me to them.

Thank you.

The varImp() function basically takes the values from the importance() function in randomForest, rescales them to 0-100, and re-sorts the variables, as you have already noticed. To get the same result from both, you can call varImp(..,scale=FALSE), for example:

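library(caret)
library(randomForest)  # attaches importance()
set.seed(111)
# train() defaults to method = "rf"; importance=TRUE is passed on to randomForest()
mdl = train(Species ~ ., data=iris,
            trControl=trainControl(method="cv"), importance=TRUE)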

importance(mdl$finalModel)[,1:3]
               setosa versicolor virginica
Sepal.Length  5.69594   6.452202  6.661104
Sepal.Width   4.46492   1.171534  4.245839
Petal.Length 22.52265  32.843039 27.864307
Petal.Width  22.11490  33.060450 31.897033

varImp(mdl,scale=FALSE)
rf variable importance

  variables are sorted by maximum importance across the classes
             setosa versicolor virginica
Petal.Width  22.115     33.060    31.897
Petal.Length 22.523     32.843    27.864
Sepal.Length  5.696      6.452     6.661
Sepal.Width   4.465      1.172     4.246
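
To see where the 0-100 values come from, here is a minimal base-R sketch that rescales the importance() matrix printed above by hand. It assumes caret shifts the whole matrix by its global minimum and divides by the global maximum (not per class), which is consistent with only one cell being exactly 100 in the question's output; treat that as an assumption rather than a statement of caret's exact code. The names imp, scaled and scaled_sorted are just illustrative.

```r
# Per-class importance values copied from the importance() output above
imp <- matrix(
  c( 5.69594,  6.452202,  6.661104,
     4.46492,  1.171534,  4.245839,
    22.52265, 32.843039, 27.864307,
    22.11490, 33.060450, 31.897033),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    c("setosa", "versicolor", "virginica")
  )
)

# Assumed rescaling: subtract the global minimum, then divide by the
# global maximum so the whole matrix spans 0 to 100
scaled <- imp - min(imp)
scaled <- scaled / max(scaled) * 100

# Sort variables by their maximum importance across the classes,
# mirroring the ordering that the varImp() printout describes
row_max <- apply(scaled, 1, max)
scaled_sorted <- scaled[order(row_max, decreasing = TRUE), ]

print(round(scaled_sorted, 2))
```

Under this assumption the largest cell (Petal.Width for versicolor) becomes 100 and the smallest (Sepal.Width for versicolor) becomes 0, so the scaled numbers only make sense relative to each other within one model.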

The importance scores are obtained by permuting each variable and measuring the resulting change in accuracy on the out-of-bag (OOB) samples. See the random forest page. It is a rough measure of how useful the variable is in predicting each class correctly.
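
The permutation idea can be illustrated with a rough base-R sketch. It is not what randomForest does internally: it swaps in a nearest-centroid classifier for the forest, uses a fixed hold-out set in place of the OOB samples, and skips the division by the standard error that importance() applies by default. All helper names (predict_centroid, class_accuracy, importance_by_class) are hypothetical.

```r
set.seed(1)

# Fixed hold-out set (every third row) standing in for the OOB samples
test_idx <- seq(3, nrow(iris), by = 3)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]
predictors <- names(iris)[1:4]

# Stand-in classifier (NOT a random forest): nearest class centroid.
# centroids is a 3 x 4 matrix: one row per class, one column per predictor
centroids <- sapply(predictors, function(p) tapply(train[[p]], train$Species, mean))

predict_centroid <- function(newdata) {
  x <- as.matrix(newdata[, predictors])
  # Squared distance from every row to every class centroid
  d <- sapply(rownames(centroids), function(cl) {
    rowSums(sweep(x, 2, centroids[cl, ])^2)
  })
  factor(rownames(centroids)[apply(d, 1, which.min)],
         levels = levels(iris$Species))
}

# Accuracy computed separately for the samples of each true class
class_accuracy <- function(pred, truth) {
  vapply(levels(truth), function(cl) mean(pred[truth == cl] == cl), numeric(1))
}

baseline <- class_accuracy(predict_centroid(test), test$Species)

# For each predictor: shuffle it, re-predict, and record the average
# drop in per-class accuracy over several shuffles
n_perm <- 20
importance_by_class <- t(sapply(predictors, function(p) {
  drops <- replicate(n_perm, {
    shuffled <- test
    shuffled[[p]] <- sample(shuffled[[p]])
    baseline - class_accuracy(predict_centroid(shuffled), test$Species)
  })
  rowMeans(drops)
}))

print(round(importance_by_class, 3))
```

Each cell is the mean drop in accuracy for one class when one predictor is shuffled, so a larger value means the model leans on that variable more to get that class right. Values can be zero or slightly negative when shuffling a variable does not hurt (or happens to help) that class, which is also why negative entries such as those for ah in the question are possible.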
