When I run a random forest model over my test data, I get different results for the same data set and the same model.
Here are the results; the difference is in the first column:
> table(predict(rfModelsL[[1]], newdata = a), a$earlyR)
        FALSE TRUE
  FALSE    14    7
  TRUE     13   66
> table(predict(rfModelsL[[1]], newdata = a), a$earlyR)
        FALSE TRUE
  FALSE    15    7
  TRUE     12   66
Although the difference is very small, I'm trying to understand what causes it. My guess is that predict
uses a "flexible" classification threshold, but I couldn't find that in the documentation. Am I right?
Thank you in advance.
I will assume that you did not refit the model between the two calls, and that the predict call alone
is producing these different results. The answer is probably this, from ?predict.randomForest:
Any ties are broken at random, so if this is undesirable, avoid it by using odd number ntree in randomForest()
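To make the quoted note concrete, here is a minimal sketch, assuming the randomForest package is installed. The data frame d and the models rf_even and rf_odd are hypothetical stand-ins, not the rfModelsL or a from the question. With two classes, the default cutoff, and an even ntree, a row can receive exactly half of the votes for each class, and that tie is broken at random on each predict call. With an odd ntree such a tie cannot occur.

```r
library(randomForest)

set.seed(42)

# Hypothetical noisy two-class data (not the data from the question)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- factor(d$x1 + d$x2 + rnorm(n) > 0)

# Same model, once with an even and once with an odd number of trees
rf_even <- randomForest(y ~ ., data = d, ntree = 500)
rf_odd  <- randomForest(y ~ ., data = d, ntree = 501)

# Raw vote counts per class; a row with equal counts is a tie
votes <- predict(rf_even, newdata = d, type = "vote", norm.votes = FALSE)
n_tied <- sum(votes[, 1] == votes[, 2])
print(n_tied)  # number of rows whose predicted class depends on a random draw

# With an odd ntree and two classes, repeated calls should agree
p1 <- predict(rf_odd, newdata = d)
p2 <- predict(rf_odd, newdata = d)
print(identical(p1, p2))
```

If refitting with an odd ntree is not an option, calling set.seed() with the same value immediately before each predict call should also make the tie-breaking repeatable, since the random draws then start from the same state.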