I have a random forest model that predicts a variable. This variable is not a categorical class but rather a number from 0 to 1. What is the best way to evaluate the accuracy of the generated models in this case?
Should I split the data into training and test sets and then simply calculate the linear correlation between predicted and observed values in the test set?
Is there a more elegant solution? If so, which package implements it?
You can of course hold out part of the data as a test set, but with a random forest this is often unnecessary because the model comes with a "built-in" out-of-bag (OOB) error estimate: each tree is evaluated on the observations that were left out of its bootstrap sample. Here is an example on the "mtcars" dataset that ends by plotting the OOB error against the number of trees:
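# install.packages("randomForest")  # run once if the package is not installed
library(randomForest)
head(mtcars)
set.seed(1)  # make the bootstrap samples reproducible
fit <- randomForest(mpg ~ ., data = mtcars, importance = TRUE, proximity = TRUE)
# For regression, this prints the OOB mean of squared residuals and % variance explained
print(fit)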
# Look at variable importance:
importance(fit)
# OOB error vs. # of trees
plot(fit)
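If you want explicit numbers rather than a plot, the OOB predictions can be compared with the observed values directly. The following is only a sketch on the same "mtcars" data: the metric choices (RMSE, MAE, R-squared, Pearson correlation), the variable names, and the 8-row hold-out split are my own assumptions for illustration, not something prescribed by the package.

```r
library(randomForest)

set.seed(1)
fit <- randomForest(mpg ~ ., data = mtcars)

# Calling predict() without newdata returns the out-of-bag predictions
oob_pred <- predict(fit)
obs <- mtcars$mpg

oob_rmse <- sqrt(mean((obs - oob_pred)^2))  # root mean squared error
oob_mae  <- mean(abs(obs - oob_pred))       # mean absolute error
oob_r2   <- 1 - sum((obs - oob_pred)^2) / sum((obs - mean(obs))^2)
oob_cor  <- cor(obs, oob_pred)              # linear correlation, as in the question

print(c(rmse = oob_rmse, mae = oob_mae, r2 = oob_r2, cor = oob_cor))

# Optional hold-out check, as suggested in the question
# (the split size of 8 rows is an arbitrary choice for illustration)
set.seed(2)
test_idx  <- sample(nrow(mtcars), 8)
fit_train <- randomForest(mpg ~ ., data = mtcars[-test_idx, ])
test_pred <- predict(fit_train, newdata = mtcars[test_idx, ])
test_rmse <- sqrt(mean((mtcars$mpg[test_idx] - test_pred)^2))
print(test_rmse)
```

Correlation alone can look good even when predictions are systematically biased or mis-scaled, so an error measure such as RMSE or MAE is probably worth reporting alongside it. With a data set as small as this one, the hold-out estimate will vary noticeably with the random split.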