To output the variable importance of my random forest model (rf), I used the code below (rfmodel_all
is my model).
importance(rfmodel_all[11][[1]])
varImp(rfmodel_all)
I got the results below, but the two functions report different variable importance values for each class. What do the values for each class mean?
importance(rfmodel_all[11][[1]])
F1 F2 F3 F4 F5
dem5m_field2 10.2504042 6.9464506 3.1169946 13.394995 17.52028
ah -2.5141337 -3.9860137 3.1314217 11.585716 13.33464
varImp(rfmodel_all)
rf variable importance,
variables are sorted by maximum importance across the classes
F1 F2 F3 F4 F5
vd 72.436 98.173 54.284 91.48 100.00
twi 10.412 8.235 22.369 92.55 82.67
Please point me to any references that explain this, if you know of some.
Thank you.
The varImp() function basically takes the values from the importance() function in randomForest, scales them to 0-100, and reorders the rows, as you have already noticed. To get the same result, you can just do varImp(..,scale=FALSE), for example:
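library(caret)
library(randomForest)
set.seed(111)
# method="rf" is the default for train(); importance=TRUE is passed on to randomForest()
mdl = train(Species ~ ., data=iris, method="rf",
trControl=trainControl(method="cv"), importance=TRUE)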
importance(mdl$finalModel)[,1:3]
setosa versicolor virginica
Sepal.Length 5.69594 6.452202 6.661104
Sepal.Width 4.46492 1.171534 4.245839
Petal.Length 22.52265 32.843039 27.864307
Petal.Width 22.11490 33.060450 31.897033
varImp(mdl,scale=FALSE)
rf variable importance
variables are sorted by maximum importance across the classes
setosa versicolor virginica
Petal.Width 22.115 33.060 31.897
Petal.Length 22.523 32.843 27.864
Sepal.Length 5.696 6.452 6.661
Sepal.Width 4.465 1.172 4.246
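To see where the 0-100 numbers come from, here is a minimal sketch in base R that rescales the unscaled values shown above. It assumes the rescaling is a min-max over the whole matrix (subtract the overall minimum, divide by the overall range, multiply by 100), which is my understanding of what scale=TRUE does; the helper name rescale_0_100 is made up for illustration and is not part of caret.

```r
# Per-class importance values copied from the importance() output above
imp <- matrix(
  c(5.69594,  6.452202,  6.661104,
    4.46492,  1.171534,  4.245839,
    22.52265, 32.843039, 27.864307,
    22.11490, 33.060450, 31.897033),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
    c("setosa", "versicolor", "virginica")
  )
)

# Assumed rescaling (hypothetical helper): min-max over all cells, mapped to 0-100
rescale_0_100 <- function(m) (m - min(m)) / (max(m) - min(m)) * 100

scaled <- rescale_0_100(imp)

# Sort rows by their largest value across classes, as the varImp() header describes
scaled <- scaled[order(apply(scaled, 1, max), decreasing = TRUE), ]
print(round(scaled, 2))
```

Under that assumption the largest cell in the whole table becomes 100 and the smallest becomes 0, so the scaled numbers are only comparable within one model, not between models.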
The importance scores are basically obtained by permuting each variable in the OOB (out-of-bag) samples and measuring the resulting change in accuracy. See the random forest page. They are a rough measure of how useful the variable is in predicting each class correctly.
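The permutation idea can be sketched by hand. The example below is a simplified illustration, not what randomForest computes internally: it uses a single hold-out set in place of the per-tree OOB samples, permutes each predictor once, and does not divide by a standard error. The helper class_accuracy and the variable names are made up for this sketch.

```r
library(randomForest)

set.seed(111)
# Hold out a third of iris; this stands in for the OOB samples the forest uses internally
test_idx <- sample(nrow(iris), 50)
train <- iris[-test_idx, ]
test  <- iris[test_idx, ]

rf <- randomForest(Species ~ ., data = train)

# Per-class accuracy: share of each true class that is predicted correctly
class_accuracy <- function(model, data) {
  pred <- predict(model, newdata = data)
  tapply(pred == data$Species, data$Species, mean)
}

baseline <- class_accuracy(rf, test)

# Permute one predictor at a time and record the drop in per-class accuracy
predictors <- setdiff(names(iris), "Species")
perm_drop <- t(sapply(predictors, function(v) {
  shuffled <- test
  shuffled[[v]] <- sample(shuffled[[v]])  # break the link between v and the class
  baseline - class_accuracy(rf, shuffled)
}))
print(round(perm_drop, 3))
```

A large drop for a class means the model relies on that variable to get that class right. A value near zero, or a negative one like those for ah in the question, means shuffling the variable made little difference or happened to help slightly for that class.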