
model$importance vs importance(model) in randomForest package

I'm a little confused about the different ways of accessing feature importance after fitting a random forest with the randomForest package in R. model$importance and importance(model) give different values. Does anyone know why?

Below is example code. MeanDecreaseAccuracy has different values depending on whether I use rf$importance or importance(rf).

library(randomForest)
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

rf$importance

                  setosa versicolor  virginica
Sepal.Length 0.028069924 0.02290131 0.02999196
Sepal.Width  0.007430743 0.00234842 0.00802824
Petal.Length 0.340913786 0.31065484 0.30779183
Petal.Width  0.326072508 0.31167317 0.27879456
             MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length          0.026581478         9.399968
Sepal.Width           0.005823167         2.256985
Petal.Length          0.317224058        43.508494
Petal.Width           0.302483961        44.047933

importance(rf)

                setosa versicolor virginica
Sepal.Length  5.848489   7.437477  6.817425
Sepal.Width   4.584855   1.294841  4.535271
Petal.Length 22.222062  33.130557 28.586522
Petal.Width  21.634934  32.550969 30.980522
             MeanDecreaseAccuracy MeanDecreaseGini
Sepal.Length             9.820337         9.399968
Sepal.Width              5.429112         2.256985
Petal.Length            33.999215        43.508494
Petal.Width             32.807621        44.047933

The two differ only by a scaling factor. Divide each MeanDecreaseAccuracy value in rf$importance by the corresponding value in rf$importanceSD:

rf$importance[, "MeanDecreaseAccuracy"] / rf$importanceSD[, "MeanDecreaseAccuracy"]
#Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
#10.643412     4.816711    34.096432    32.764032

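This shows why: by default (scale = TRUE), importance() divides each permutation-based measure, i.e. MeanDecreaseAccuracy and the per-class columns, by its standard error stored in rf$importanceSD, whereas rf$importance holds the raw values. MeanDecreaseGini is not scaled, which is why it is identical in both outputs. The figures here differ slightly from those in the question, presumably because the forest was refit without a fixed seed.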
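To check this on your own model, a minimal sketch along the following lines may help. It assumes the iris example from the question; the seed value and the variable names raw, manual and scaled are arbitrary choices made for illustration. It compares importance() with scale = FALSE against the stored matrix, then redoes the scaling by hand.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only so that repeated runs fit the same forest
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

# With scale = FALSE, importance() should return the raw matrix stored in the model
raw <- importance(rf, scale = FALSE)
all.equal(raw, rf$importance)

# Redo the default scaling by hand for MeanDecreaseAccuracy
manual <- rf$importance[, "MeanDecreaseAccuracy"] /
  rf$importanceSD[, "MeanDecreaseAccuracy"]
scaled <- importance(rf)[, "MeanDecreaseAccuracy"]
all.equal(manual, scaled)
```

Both all.equal() calls should return TRUE on this data. If you want the unscaled permutation importance only, importance(rf, type = 1, scale = FALSE) returns just that column.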
