Random forest evaluation in R

Question

I am new to R and am trying to build my first model. I am working on a two-class random forest project, and so far I have written the model as follows:

library(randomForest)

set.seed(2015)

randomforest <- randomForest(as.factor(goodkit) ~ ., data=training1, importance=TRUE,ntree=2000)

varImpPlot(randomforest)

prediction <- predict(randomforest, test,type='prob')

print(prediction)

I am not sure why I don't get the overall prediction for my model. I must be missing something in my code. I get the OOB and the prediction per case in the test set, but not the overall prediction of the model.

library(pROC)

auc <-roc(test$goodkit,prediction)

print(auc)

This doesn't work at all.

I have been through the pROC manual but I cannot understand all of it. It would be very helpful if anyone could help with the code or post a link to a good practical example.

Answer 1

Using the ROCR package, the following code should work for calculating the AUC:

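library(ROCR)
# prediction[, 2] is the second column of the probability matrix from predict();
# prediction() is the ROCR function that pairs scores with the true labels.
predictedROC <- prediction(prediction[, 2], as.factor(test$goodkit))
# Extract the AUC as a plain number.
as.numeric(performance(predictedROC, "auc")@y.values)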

Answer 2

Your problem is that predict on a randomForest object with type='prob' returns a matrix with one column per class: each column contains the probability of belonging to that class. For a binary problem that means two columns, while roc needs a single vector of scores.

You have to decide which of these columns to use to build the ROC curve. Fortunately, for binary classification they carry the same information (each column is one minus the other), so either can be used:

auc1 <- roc(test$goodkit, prediction[, 1])
print(auc1)
auc2 <- roc(test$goodkit, prediction[, 2])
print(auc2)
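
To see why the two columns are "just reversed", here is a minimal base R sketch. It does not use pROC or the asker's data: the vector truth and the matrix prob are hypothetical stand-ins for test$goodkit and the output of predict(..., type='prob'), and auc_rank is an illustrative helper that computes the AUC from ranks (the Mann-Whitney statistic). The last lines also show one way to get a single overall figure for the test set, assuming a 0.5 cutoff.

```r
# Hypothetical toy data standing in for test$goodkit and predict(..., type = "prob").
truth <- factor(c(0, 0, 1, 1, 0, 1, 1, 0))
p1 <- c(0.10, 0.40, 0.80, 0.35, 0.20, 0.90, 0.70, 0.60)  # P(class "1")
prob <- cbind("0" = 1 - p1, "1" = p1)                     # each row sums to 1

# Illustrative helper: rank-based AUC, i.e. the chance that a randomly chosen
# positive case gets a higher score than a randomly chosen negative case.
auc_rank <- function(score, is_pos) {
  r <- rank(score)                      # average ranks handle ties
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

is_pos <- truth == "1"
auc_col2 <- auc_rank(prob[, "1"], is_pos)  # 0.875
auc_col1 <- auc_rank(prob[, "0"], is_pos)  # 0.125, i.e. 1 - auc_col2

# Overall accuracy on the test set, assuming a 0.5 cutoff on P(class "1").
pred_class <- ifelse(prob[, "1"] > 0.5, "1", "0")
accuracy <- mean(pred_class == as.character(truth))  # 0.75
print(table(predicted = pred_class, actual = truth))  # confusion matrix
```

With a fixed direction the two AUCs sum to 1. By default pROC's roc chooses the direction automatically, which is why auc1 and auc2 above should report the same value.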

solution1
1 2015-07-28 15:36:09

solution2
0 ACCEPTED 2015-07-28 18:19:03
