
Caret Package method = “treebag”

Here is my output from running the train function:

Bagged CART 


1251 samples
  30 predictors
   2 classes: 'N', 'Y' 


No pre-processing
Resampling: Bootstrapped (25 reps) 


Summary of sample sizes: 1247, 1247, 1247, 1247, 1247, 1247, ... 


Resampling results


  Accuracy  Kappa  Accuracy SD  Kappa SD
  0.806     0.572  0.0129       0.0263  

Here is my confusionMatrix output:

Bootstrapped (25 reps) Confusion Matrix 


(entries are percentages of table totals)

          Reference
Prediction    N       Y
         N    24.8   7.9
         Y    11.5  55.8
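As a quick consistency check, the percentages in this table can be reduced to the accuracy and kappa reported above. The sketch below uses base R only; the variable names are illustrative and the cell values are copied from the table. Kappa computed from the rounded, pooled table comes out near 0.571, while the reported 0.572 is an average over resamples, so a small difference is expected.

```r
# Cell percentages copied from the posted table
# (rows = prediction, columns = reference)
cm <- matrix(c(24.8,  7.9,
               11.5, 55.8),
             nrow = 2, byrow = TRUE,
             dimnames = list(Prediction = c("N", "Y"),
                             Reference  = c("N", "Y")))

p <- cm / sum(cm)                          # percentages -> proportions
accuracy <- sum(diag(p))                   # observed agreement
expected <- sum(rowSums(p) * colSums(p))   # agreement expected by chance
kap      <- (accuracy - expected) / (1 - expected)   # Cohen's kappa
majority <- max(colSums(p))                # share of the larger reference class

round(c(accuracy = accuracy, kappa = kap, majority = majority), 3)
```

The `majority` value is what a model would score by always predicting the larger class. In this table it is about 64% (the question text puts it at roughly 67%), which is a useful baseline when judging any accuracy figure for this data.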

After partitioning the data set into 80% training and 20% test, I trained the model and then ran "predict" on the test partition, which gave ~65% accuracy.

Questions:

(1) Does this mean my model is not very good?
(2) Is 'treebag' the proper method since I only have 2 classes: 'N', 'Y' ?  Would a Logistic Regression method be better?
(3) Finally, my 1251 samples are roughly 67% 'Y' and 33% 'N'.  Could this be "skewing" my training / results?  Do I need a ratio closer to 50 - 50?

Any help would be greatly appreciated!!

Code and a reproducible example would help here.

Assuming the confusion matrix came from running confusionMatrix.train, I would say that your model looks pretty good. The difference in accuracy is a little puzzling. I regularly see test set results that look worse than the resampling results, but the bootstrap can be quite pessimistic when measuring performance, and here it looks much better than the test set. Try a different training/test split and see whether you get something similar (or try repeated 10-fold CV).
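A minimal sketch of that suggestion, assuming the caret package is installed along with the packages its "treebag" method relies on (ipred in particular). The original data were not posted, so `twoClassSim()` from caret generates a simulated stand-in; the seed, sample size, and number of repeats are arbitrary choices, not the poster's settings.

```r
library(caret)   # method = "treebag" also needs ipred (and its helpers) installed

set.seed(1)
# Simulated two-class data standing in for the unposted data set
dat <- twoClassSim(n = 500)

# Stratified 80/20 split; change the seed to try a different split
in_train <- createDataPartition(dat$Class, p = 0.8, list = FALSE)
training <- dat[in_train, ]
testing  <- dat[-in_train, ]

# Repeated 10-fold cross-validation instead of the default bootstrap
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)
fit  <- train(Class ~ ., data = training, method = "treebag", trControl = ctrl)

fit$results            # resampled Accuracy and Kappa
confusionMatrix(fit)   # resampled confusion matrix (percent of table totals)

# Held-out performance, for comparison with the resampled estimate
pred <- predict(fit, newdata = testing)
confusionMatrix(pred, testing$Class)
```

If the resampled and held-out accuracies stay far apart across several seeds, the gap is less likely to be an accident of one particular split.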

(a) Again, hard to say with what you have posted.

(b) That model is excellent, and there is no general rule about which model is better or worse (google the "no free lunch" theorem).

(c) That imbalance isn't too bad, so I don't think it is an issue (unless the class percentages differ between the training and test sets).
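The caveat in (c) is easy to check. The sketch below assumes caret is installed and uses a simulated outcome with roughly the 67/33 split described in the question; `y` and the seed are hypothetical stand-ins for the real outcome column.

```r
library(caret)

set.seed(42)
# Hypothetical outcome with about 67% 'Y' and 33% 'N'
y <- factor(sample(c("N", "Y"), size = 1251, replace = TRUE,
                   prob = c(0.33, 0.67)))

# createDataPartition samples within each class, so the split is stratified
idx <- createDataPartition(y, p = 0.8, list = FALSE)

train_share <- prop.table(table(y[idx]))    # class shares in the training set
test_share  <- prop.table(table(y[-idx]))   # class shares in the test set
round(rbind(train = train_share, test = test_share), 3)
```

With a stratified split the two rows should be nearly identical. A split made some other way (for example, taking the first 80% of the rows) can leave the two sets with noticeably different class shares.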

Max
