I'm having trouble understanding the per-class columns returned by the importance
function in randomForest.
My data set has two classes, "Current" and "Departed". To predict those classes,
I first create a random forest model:
fit <- randomForest(IsDeparted ~ ..., df_train)
Then I run the importance function:
importance(fit)
The result has four columns of importance measures: "Current", "Departed", "MDA" and "GiniDecrease".
Could someone explain how to interpret the first two class columns? Is it the mean decrease in accuracy of predicting one particular class after permuting values of that particular variable? And if so, does that mean I should focus on those columns rather than the MDA column when doing feature selection if I am more interested in the model's performance for one particular class?
Yes, the first two columns are class-specific. Each one is the mean decrease in accuracy for that class after permuting the variable, and by default each is divided by its own standard error. If you mainly care about the accuracy for one class, you can look at that class's column.
Let's use an example, where the default importance() function returns a scaled importance:
library(randomForest)
set.seed(111)
fit = randomForest(Species ~ .,data=iris,importance=TRUE)
importance(fit)
                setosa versicolor virginica MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length  6.716993  7.4654657  7.697842            10.869088         9.333510
Sepal.Width   4.581990 -0.5208697  4.224459             3.772957         2.425592
Petal.Length 22.155981 33.0549839 27.892363            33.272150        43.324744
Petal.Width  22.497643 31.4966353 31.589361            33.123064        44.146107
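If one class matters most, a rough sketch of ranking variables by that class's column could look like this. The choice of "versicolor" and the variable names `imp` and `cls` are only for illustration:

```r
library(randomForest)
set.seed(111)
fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

imp <- importance(fit)   # scaled importance matrix, one row per variable
cls <- "versicolor"      # class of interest (arbitrary choice for illustration)

# Rank variables by their permutation importance for that class only
sort(imp[, cls], decreasing = TRUE)

# Compare with the ranking from the overall MeanDecreaseAccuracy column
sort(imp[, "MeanDecreaseAccuracy"], decreasing = TRUE)
```

The two rankings can disagree, for example when a variable helps one class but not the others.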
If you look at the unscaled values, obtained with importance(fit, scale=FALSE), you can see that the MDA column is roughly the average of the 3 class columns. That holds here because the 3 classes are balanced. If you have imbalanced classes, it will be different:
                  setosa   versicolor   virginica MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length 0.034156211  0.021093423 0.036147901          0.030810465         9.333510
Sepal.Width  0.006522917 -0.001117593 0.006937731          0.004273138         2.425592
Petal.Length 0.329299111  0.301621639 0.296869242          0.305569113        43.324744
Petal.Width  0.335363736  0.298729184 0.279526019          0.302855284        44.146107
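To check both points yourself, something like the following should work. It is a minimal sketch that assumes the fitted object keeps the standard errors in `fit$importanceSD`, which is the case when the forest is built with importance=TRUE; the names `raw`, `scaled` and `acc` are just for illustration:

```r
library(randomForest)
set.seed(111)
fit <- randomForest(Species ~ ., data = iris, importance = TRUE)

raw    <- importance(fit, scale = FALSE)
scaled <- importance(fit)   # scale = TRUE is the default
acc    <- c(levels(iris$Species), "MeanDecreaseAccuracy")

# Scaled accuracy columns are the raw ones divided by the stored
# "standard errors" (this can differ if an SD is essentially zero)
all.equal(scaled[, acc], raw[, acc] / fit$importanceSD[, acc],
          check.attributes = FALSE)

# Unscaled MDA next to the plain average of the per-class columns
cbind(MDA       = raw[, "MeanDecreaseAccuracy"],
      classMean = rowMeans(raw[, levels(iris$Species)]))
```

The MeanDecreaseGini column is not affected by scaling, which is why it is identical in both tables.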