Difference between importance() (randomForest) and randomForest$importance
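I don't understand the difference between the importance() function in the randomForest package and the importance values stored in the fitted random forest model.
I fitted a simple RF classification model and tried to look up the variable importance with the following code: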
rf_model$importance
             0           1 MeanDecreaseAccuracy MeanDecreaseGini
X1 0.096886458 0.032546101          0.055488009      2472.172207
X2 0.030985037 0.025615202          0.027530078      1338.378297
X3 0.124302743 0.012551971          0.052402188      3091.891586

importance(rf_model)
             0           1 MeanDecreaseAccuracy MeanDecreaseGini
X1 159.9149603 175.6265625           242.424683      2472.172207
X2 104.8273654 97.09338154          129.5084398      1338.378297
X3 157.0207876 86.93847182          216.6374153      3091.891586
Why do the first three columns differ between the two outputs, while MeanDecreaseGini stays the same?
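By default, importance(rf_model) is called with scale = TRUE, so the permutation-based measures are divided by their "standard errors". The node-impurity measure (MeanDecreaseGini for classification, IncNodePurity for regression) is not scaled, which is why that column is identical in both outputs. Consider the following example: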
library(randomForest)
set.seed(4543)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data = mtcars, ntree = 1000,
                          keep.forest = FALSE, importance = TRUE)
mtcars.rf$importance
#output
        %IncMSE IncNodePurity
cyl   7.3939431     162.38777
disp 10.0468306     257.46627
hp    7.6801388     200.22729
drat  1.0921653      65.96165
wt    9.7998328     250.94940
qsec  0.6066792      38.52055
vs    0.7048540      24.75183
am    0.6201962      17.27180
gear  0.4110634      16.33811
carb  1.0549523      27.47096
which is the same as:
importance(mtcars.rf, scale = FALSE)
        %IncMSE IncNodePurity
cyl   7.3939431     162.38777
disp 10.0468306     257.46627
hp    7.6801388     200.22729
drat  1.0921653      65.96165
wt    9.7998328     250.94940
qsec  0.6066792      38.52055
vs    0.7048540      24.75183
am    0.6201962      17.27180
gear  0.4110634      16.33811
carb  1.0549523      27.47096
whereas the default call (scale = TRUE) gives:
importance(mtcars.rf)
       %IncMSE IncNodePurity
cyl  15.767986     162.38777
disp 19.885128     257.46627
hp   18.177916     200.22729
drat  7.002942      65.96165
wt   18.479239     250.94940
qsec  5.022593      38.52055
vs    4.427525      24.75183
am    6.435329      17.27180
gear  3.968845      16.33811
carb  8.207903      27.47096
Finally, dividing the unscaled values by the stored standard errors reproduces the default output:
importance(mtcars.rf, scale = FALSE)[,1]/mtcars.rf$importanceSD
      cyl      disp        hp      drat        wt      qsec        vs        am      gear      carb
15.767986 19.885128 18.177916  7.002942 18.479239  5.022593  4.427525  6.435329  3.968845  8.207903
which is identical to importance(mtcars.rf)[,1]:
all.equal(importance(mtcars.rf, scale = FALSE)[,1]/mtcars.rf$importanceSD,
          importance(mtcars.rf)[,1])
#output
TRUE
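
The question itself concerns a classification forest, where the stored importance has one column per class in addition to MeanDecreaseAccuracy and MeanDecreaseGini. The sketch below checks the same relationship for that case. It is only an illustration: it uses the built-in iris data because the asker's rf_model is not available, and the names iris.rf, raw, scaled, k and se are made up for this example. The guard against zero standard errors is an assumption about how division by zero is avoided; it should not matter for this data.

```r
library(randomForest)

set.seed(4543)
data(iris)

# Classification forest; importance = TRUE stores the permutation-based measures
iris.rf <- randomForest(Species ~ ., data = iris, ntree = 500, importance = TRUE)

raw    <- iris.rf$importance    # unscaled values, as stored in the model object
scaled <- importance(iris.rf)   # default call, scale = TRUE

k <- ncol(raw)                  # last column is MeanDecreaseGini

# Standard errors for the per-class columns and MeanDecreaseAccuracy.
# Assumption: zero standard errors are replaced by 1 to avoid dividing by zero.
se <- iris.rf$importanceSD
se <- ifelse(se < .Machine$double.eps, 1, se)

# Permutation-based columns: raw value divided by its standard error
all.equal(raw[, -k] / se, scaled[, -k], check.attributes = FALSE)

# Gini column: not scaled, so it is the same in both outputs
all.equal(raw[, k], scaled[, k])
```

Both comparisons should return TRUE, which matches the pattern in the question: the class columns and MeanDecreaseAccuracy change under scaling, while MeanDecreaseGini does not.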