
R random forest feature selection based on AUC

For binary option prediction (rise or fall), I am trying random forest in R, but the importance measures and OOB estimates are biased in my case.

I found this article but it is Python related.

Is there an R package approach for automatic feature selection that

  • is based on AUC
  • maybe allows me to define my own evaluation function (money earned is a function of the recall and precision rates)
  • maybe allows me to specify the cross-validation approach: randomly selecting training and test cases is biased because this is time-series data, where the test data must come later than the training data

I just came across this question and found a package that might help you:

i. It's called AUCRF, it performs feature selection in a random forest model based on optimizing AUC. https://cran.r-project.org/web/packages/AUCRF/AUCRF.pdf

ii. It does allow cross-validation of your AUC-based selection with AUCRFcv(x, nCV = 5, M = 20)

where x is a fitted AUCRF object, nCV is the number of folds, and M is the number of repeats.
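To make points i and ii concrete, here is a minimal sketch of how the two calls fit together. It assumes the AUCRF package is installed; the data are simulated, and the names dat, fit and fit_cv are just placeholders. Note that, per the package manual, the outcome has to be a factor coded 0/1.

```r
# Sketch only: requires install.packages("AUCRF"); the data are simulated.
library(AUCRF)

set.seed(1)
n <- 200
p <- 20
X <- as.data.frame(matrix(rnorm(n * p), nrow = n))   # columns V1..V20
# Only V1 and V2 carry signal; the outcome must be a factor coded 0/1.
y <- factor(as.integer(X$V1 + X$V2 + rnorm(n) > 0), levels = c(0, 1))
dat <- cbind(Y = y, X)

# Backward elimination of variables, driven by the OOB-AUC.
fit <- AUCRF(Y ~ ., data = dat)
summary(fit)
fit$Xopt            # names of the selected variables
fit$`OOB-AUCopt`    # OOB-AUC of the selected subset

# Repeated cross-validation of the whole selection procedure.
fit_cv <- AUCRFcv(fit, nCV = 5, M = 2)   # M kept small here for speed
fit_cv$cvAUC        # cross-validated AUC
OptimalSet(fit_cv)  # selected variables with their importance
```

M = 2 is only to keep the run short; the default of 20 repeats gives a more stable estimate.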

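iii. Regarding your own evaluation function: AUCRF takes a formula of the form response ~ predictors, but as far as I can tell that only specifies the model, not the metric being optimized, so a custom evaluation would probably have to be written separately. You will have to explore that more for your specific case, since you have not provided test code.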
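As far as I know AUCRFcv draws its folds at random, so for the time-ordered validation and the custom payoff you may need to roll your own loop. Below is a rough base-R sketch of that idea, not something taken from AUCRF: auc, profit and walk_forward are hypothetical helpers, the gain/loss values are made up, and a logistic regression stands in for the random forest so that the example needs no packages.

```r
# AUC via the rank-sum (Mann-Whitney) identity; y is 0/1, score is numeric.
auc <- function(y, score) {
  r <- rank(score)                 # midranks handle ties
  n_pos <- sum(y == 1)
  n_neg <- sum(y == 0)
  (sum(r[y == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical payoff: win `gain` per true positive, lose `loss` per false positive.
profit <- function(y, pred, gain = 0.8, loss = 1) {
  tp <- sum(pred == 1 & y == 1)
  fp <- sum(pred == 1 & y == 0)
  gain * tp - loss * fp
}

# Expanding-window splits: train on rows 1..end, test on the next `horizon` rows,
# so every test row is later than every training row.
walk_forward <- function(n, initial, horizon) {
  ends <- seq(initial, n - horizon, by = horizon)
  lapply(ends, function(end) {
    list(train = seq_len(end), test = (end + 1):(end + horizon))
  })
}

# Simulated, time-ordered data (row order = time order).
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- as.integer(d$x + rnorm(100) > 0)

splits <- walk_forward(n = nrow(d), initial = 60, horizon = 10)
res <- sapply(splits, function(s) {
  m <- glm(y ~ x, data = d[s$train, ], family = binomial)   # stand-in model
  p <- predict(m, newdata = d[s$test, ], type = "response")
  c(auc    = auc(d$y[s$test], p),
    profit = profit(d$y[s$test], as.integer(p > 0.5)))
})
res   # one column per split
```

To use this for feature selection you would rerun the loop for each candidate feature subset and keep the subset with the best average AUC or profit across splits.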

Hope this helps!
