
How to calculate variable importance of each class with R caret package?

To output the variable importance of my random forest model (rf), I used the code below (rfmodel_all is my model).

importance(rfmodel_all[11][[1]])
varImp(rfmodel_all)

I got the results below, but the two functions report different variable importance values for each class. What do the per-class values mean?

importance(rfmodel_all[11][[1]])

                        F1         F2         F3        F4       F5
 dem5m_field2   10.2504042  6.9464506  3.1169946 13.394995 17.52028
 ah             -2.5141337 -3.9860137  3.1314217 11.585716 13.33464

varImp(rfmodel_all)

rf variable importance

  variables are sorted by maximum importance across the classes
                    F1     F2     F3    F4     F5
 vd             72.436 98.173 54.284 91.48 100.00
 twi            10.412  8.235 22.369 92.55  82.67

If you know of any references that explain this, please point me to them.

Thank you.

For a random forest model, varImp() takes the values returned by importance() from the randomForest package, scales them to a 0-100 range, and sorts the rows, as you have already noticed. To get the same values as importance(), call varImp(..., scale=FALSE), for example:

library(caret)
library(randomForest)  # provides importance()

set.seed(111)
mdl = train(Species ~ ., data=iris, method="rf",
            trControl=trainControl(method="cv"), importance=TRUE)

importance(mdl$finalModel)[,1:3]
               setosa versicolor virginica
Sepal.Length  5.69594   6.452202  6.661104
Sepal.Width   4.46492   1.171534  4.245839
Petal.Length 22.52265  32.843039 27.864307
Petal.Width  22.11490  33.060450 31.897033

varImp(mdl,scale=FALSE)
rf variable importance

  variables are sorted by maximum importance across the classes
             setosa versicolor virginica
Petal.Width  22.115     33.060    31.897
Petal.Length 22.523     32.843    27.864
Sepal.Length  5.696      6.452     6.661
Sepal.Width   4.465      1.172     4.246
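
To see how the scaled and unscaled values relate, the rescaling can be reproduced by hand. This is only a sketch: it assumes that varImp() applies a single min-max rescaling over all classes at once (smallest value to 0, largest to 100), which is consistent with the output in the question, where only one cell equals 100. The matrix is copied from the importance() output above; the names imp and scaled are made up for the illustration.

```r
# Per-class importance values copied from the importance() output above
imp <- matrix(
  c( 5.69594,  6.452202,  6.661104,
     4.46492,  1.171534,  4.245839,
    22.52265, 32.843039, 27.864307,
    22.11490, 33.060450, 31.897033),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    c("setosa", "versicolor", "virginica")))

# Assumed rescaling: shift so the smallest value is 0,
# then stretch so the largest value is 100
scaled <- (imp - min(imp)) / (max(imp) - min(imp)) * 100

# Sort rows by their maximum importance across the classes,
# which is the order varImp() prints them in
scaled <- scaled[order(apply(scaled, 1, max), decreasing = TRUE), ]

print(round(scaled, 2))
```

If the assumption holds, this should line up with what varImp(mdl) prints with the default scale=TRUE. The ranking of the variables within a class is not changed by the rescaling, only the units.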

The importance scores are obtained by permuting each variable in the out-of-bag (OOB) samples and measuring the resulting change in accuracy. See random forest page. They are a rough measure of how useful the variable is for predicting each class correctly.
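
As a rough illustration of where the per-class numbers come from, the sketch below fits a random forest directly with the randomForest package and compares the raw permutation importance with the default output of importance(). It relies on the scale argument of importance(), which controls whether the permutation-based measures are divided by their standard errors, and on the importanceSD component of the fitted object. The variable names rf, raw, std and se are made up for the example, and the exact numbers depend on the seed.

```r
library(randomForest)

set.seed(111)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

# Raw per-class importance: mean drop in OOB accuracy
# after permuting each variable
raw <- importance(rf, scale = FALSE)[, 1:3]

# Default output: the raw values divided by their standard errors
std <- importance(rf)[, 1:3]

# Standard errors stored on the fitted object, one column per class
se <- rf$importanceSD[, 1:3]

print(raw)

# Check that the default output is raw / standard error
# (near-zero standard errors are left undivided)
print(all.equal(std, raw / ifelse(se < .Machine$double.eps, 1, se)))
```

A negative value, such as those for ah in classes F1 and F2 in the question, means that permuting the variable did not reduce OOB accuracy for that class on average, so the variable is not helping to predict that class.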


 