Turning a Random Forest into a Decision Tree - Using randomForest package in R

Is it possible to generate a decision forest whose trees are exactly the same? Please note that this is an experimental question. As far as I understand random forests have two parameters that lead to the 'randomness' compared to a single decision tree:

1) number of features randomly sampled at each node of a decision tree, and

2) number of training examples drawn to create a tree.

Intuitively, if I set these two parameters to their maximum values, I should remove the 'randomness', so every tree created should be exactly the same. Because all the trees would be identical, I should get the same results regardless of the number of trees in the forest or of the run (i.e. different seed values).

I have tested this idea using the randomForest library in R. I think the two parameters above correspond to 'mtry' and 'sampsize' respectively. I set both to their maximum values, but some randomness remains: the OOB error estimates still vary with the number of trees in the forest.

Would you please help me understand how to remove all the randomness in a random decision forest, preferably using the arguments of the randomForest library in R?

In addition to mtry and sampsize, there's another relevant argument in randomForest(): replace. By default, the data points used to grow each tree are sampled with replacement. If you want all data points to be used in every tree, you need not only to set sampsize to the number of data points but also to set replace=FALSE.
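To see why replace matters even when sampsize equals the number of rows, here is a minimal base-R sketch. It only mimics the row sampling with sample() and is not the package's internal code; the names n, boot and noreplace are made up for illustration.

```r
set.seed(1)
n <- 10  # hypothetical number of training rows

# Drawing n rows WITH replacement (the default behaviour described above):
# some rows are typically drawn more than once and others not at all.
boot <- sample(n, n, replace = TRUE)

# Drawing n rows WITHOUT replacement: simply a permutation of all rows.
noreplace <- sample(n, n, replace = FALSE)

length(unique(boot))       # number of distinct rows in the bootstrap draw
length(unique(noreplace))  # always equal to n
```

So with replacement, each tree can still see a different subset of the data even at the maximum sampsize, which is one source of the remaining randomness.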

Here's a toy example to show that you can get a forest of identical trees:

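library(randomForest)

set.seed(17)

# Toy data: 10 observations, 5 integer-valued predictors, binary response
x <- matrix(sample(5, 50, replace=TRUE), 10, 5)
y <- factor(sample(2, 10, replace=TRUE))

# Try every predictor at each split and use every row, without replacement, in each tree
rf1 <- randomForest(x, y, mtry=ncol(x), sampsize=nrow(x), replace=FALSE, ntree=5)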

You can then use getTree(rf1, 1), etc. to check that all trees are identical.
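Rather than comparing the trees by eye, the check can be scripted. The following is a sketch that rebuilds the same toy forest (with keep.inbag=TRUE added so the in-bag matrix is returned) and compares every tree against the first one. The names trees and same_as_first are made up here. It is worth running the check rather than assuming the result, since it is only known to hold for this toy example.

```r
library(randomForest)

set.seed(17)
x <- matrix(sample(5, 50, replace = TRUE), 10, 5)
y <- factor(sample(2, 10, replace = TRUE))

# Same settings as the toy example, plus keep.inbag = TRUE to record
# which rows were used for each tree.
rf1 <- randomForest(x, y, mtry = ncol(x), sampsize = nrow(x),
                    replace = FALSE, ntree = 5, keep.inbag = TRUE)

# Extract each tree as a matrix and compare it with the first tree.
trees <- lapply(seq_len(rf1$ntree), function(k) getTree(rf1, k))
same_as_first <- vapply(trees,
                        function(tr) isTRUE(all.equal(tr, trees[[1]])),
                        logical(1))
print(same_as_first)

# Every row should be in-bag for every tree under these settings.
print(all(rf1$inbag == 1))
```

Note that when every row is in-bag for every tree, no out-of-bag cases remain, so the OOB error estimate is no longer meaningful for such a forest.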
