How can I add more trees to a random forest in R?
Is there a canonical way to iteratively add trees to a random forest? Let's say I am using the caret package and I fit something like
rf_fit <- train(y~.,data=df,method="rf",ntree = N)
for some N, and then I would like to continue adding trees to it. How would I go about that?
You could write your own function and lapply it across several values of ntree:
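library(caret)  # provides train()
data <- iris
# Fit a random forest with the given number of trees
fit_tree <- function(ntree) {
  train(Species ~ ., data = data, method = "rf", ntree = ntree)
}
# One model per forest size: 100, 200, ..., 500 trees
fit <- lapply(seq(100, 500, 100), fit_tree)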
Here fit is a list of 5 models (caret train objects, each wrapping a randomForest), each fitted with one of the ntree values given in the first argument of lapply. I don't know whether it is possible to add trees to the same model. If the model fitted with n trees doesn't reach the accuracy you want, you can simply re-fit it with, for example, n + 100 trees. Keep in mind, though, that increasing the number of trees doesn't necessarily improve the accuracy, and it could even worsen performance. With method = "rf" in caret, the default ntree is 500 (the randomForest default), as suggested by Breiman in his original paper (Breiman, 2001).
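Before re-fitting with more trees, it can help to check whether the error has already levelled off. The following is a minimal sketch, not part of the original answer, that calls randomForest directly (rather than through caret) and reads the out-of-bag error stored in the fitted object. The seed and the tree counts printed are arbitrary choices for illustration.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only for reproducibility
rf <- randomForest(Species ~ ., data = iris, ntree = 500)

# For classification, err.rate has one row per tree count; the "OOB" column
# is the out-of-bag error of the forest made of the first k trees
oob <- rf$err.rate[, "OOB"]

# OOB error at a few forest sizes
print(oob[c(100, 200, 300, 400, 500)])

# Plot the whole curve to see where it flattens out
plot(oob, type = "l", xlab = "Number of trees", ylab = "OOB error")
```

If the curve is flat well before the last tree, adding more trees is unlikely to change the accuracy much.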
EDIT
To add trees to an existing randomForest model, you can do something like this:
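library(randomForest)  # provides randomForest() and grow()
# Fit a forest with the default ntree, then add how.many more trees to it
fit_tree <- function(how.many) {
  rf_fit <- randomForest(Species ~ ., data = iris)
  grow(rf_fit, how.many)
}
# Each call starts from a fresh 500-tree forest and adds 10, 20, ..., 100 trees
p <- lapply(seq(10, 100, 10), fit_tree)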
Here the starting ntree is the default one (i.e. 500). Each lapply iteration fits a fresh forest and then adds 10, 20, ..., 100 trees to it, so p is a list of ten forests with 510 to 600 trees. Note that with this approach it is not so helpful to tune the mtry parameter with caret, since the best parameter value would be found only for the first call to randomForest, not for the updated model.