
R random forest feature selection based on AUC

For binary option prediction (rise or fall) I am trying random forest in R, but the importance measures and the OOB estimates are biased in my case.

I found this article, but it is Python-related.

Is there an R package for automatic feature selection that

  • is based on AUC
  • maybe allows me to define my own evaluation function (money earned is a function of the recall and precision rates)
  • maybe allows me to specify the cross-validation approach: randomly selecting training and test cases is biased because this is time series data, where the test data must be later than the training data

I just came across this question and found a package that might help you:

i. It's called AUCRF. It performs feature selection in a random forest model by optimizing AUC. https://cran.r-project.org/web/packages/AUCRF/AUCRF.pdf

ii. It does allow cross-validation of your AUC-based selection: AUCRFcv(x, nCV = 5, M = 20)

where nCV is the number of folds and M is the number of repeats.
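As a rough sketch of points i and ii, the calls might look like the following. The data set is synthetic and invented purely for illustration, the variable names are placeholders, and the code assumes the AUCRF package is installed and that the response is a binary factor coded 0/1, as the package manual describes.

```r
# Sketch only: assumes the AUCRF package (and randomForest) is installed.
library(AUCRF)

set.seed(1)

# Synthetic data, invented for illustration: 200 rows, 10 numeric features,
# of which only x1 and x2 carry any signal.
n <- 200
X <- as.data.frame(matrix(rnorm(n * 10), ncol = 10))
names(X) <- paste0("x", 1:10)

# Binary factor response: 1 = positive class, 0 = negative class.
y <- factor(as.integer(X$x1 + X$x2 + rnorm(n) > 0), levels = c(0, 1))
dat <- cbind(y = y, X)

# Backward elimination guided by the OOB-AUC of each random forest.
fit <- AUCRF(y ~ ., data = dat)
summary(fit)
selected <- OptimalSet(fit)  # variables in the AUC-optimal subset

# Repeated cross-validation of the whole selection procedure.
# M is reduced from the default of 20 only to keep this run short.
fit_cv <- AUCRFcv(fit, nCV = 5, M = 2)
fit_cv$cvAUC
```

The cross-validated AUC is usually the number to report, because the OOB-AUC of the selected subset was itself used to pick that subset.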

iii. As for your own evaluation function, the package lets you specify the model with a formula (using ~), but you will have to explore that further for your specific case, since you have not provided test code.
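If the formula interface turns out not to cover a payoff rule or a time-ordered split, both can be sketched by hand in base R. Everything below is an assumption made for illustration and is not part of AUCRF: the helper names walk_forward_folds and profit, the gain and loss amounts, and the requirement that rows are already sorted by time.

```r
# Expanding-window ("walk-forward") folds: every test block lies strictly
# after its training rows. Rows are assumed to be sorted by time.
walk_forward_folds <- function(n, n_folds = 4, min_train = floor(n / 2)) {
  # Each test block needs at least one row.
  stopifnot(n - min_train >= n_folds)
  # Boundaries of the test blocks, spread over the rows after min_train.
  breaks <- floor(seq(min_train, n, length.out = n_folds + 1))
  lapply(seq_len(n_folds), function(i) {
    list(train = seq_len(breaks[i]),
         test  = (breaks[i] + 1):breaks[i + 1])
  })
}

# Hypothetical payoff: win `gain` on each correct "rise" call and lose
# `loss` on each wrong one. Both amounts are made-up placeholders.
profit <- function(actual, predicted, gain = 0.8, loss = 1) {
  tp <- sum(predicted == 1 & actual == 1)  # correct "rise" calls
  fp <- sum(predicted == 1 & actual == 0)  # wrong "rise" calls
  gain * tp - loss * fp
}

folds <- walk_forward_folds(100, n_folds = 4)
str(folds[[1]])

# Toy labels and predictions: two correct "rise" calls and one wrong one.
profit(actual = c(1, 0, 1, 1, 0), predicted = c(1, 1, 0, 1, 0))
```

A model would then be fitted on each fold's train rows and scored with profit on its test rows, so that no prediction is ever evaluated on data older than what the model was trained on.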

Hope this helps! 希望这可以帮助!
