
Progress bar in random forest model in R

I am using the randomForest model in R.

With a large number of trees, my program takes a long time to complete.

In the "randomForest" function I can use "do.trace=TRUE" to see the progress in real time. Sample output on the R console looks like this:

ntree    OOB      1      2      3      4      5      6      7      8      9 
100:   2.31%  7.14%  2.08%  0.00%  2.25% 10.81%  0.90%  0.00%  0.00%  1.72% 
200:   1.95%  7.14%  2.08%  0.00%  2.25%  8.11%  0.00%  0.00%  0.00%  1.72% 
300:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
400:   1.95%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  3.45% 
500:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
600:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
700:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
800:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
900:   1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 
1000:  1.78%  7.14%  2.08%  0.00%  1.69%  8.11%  0.00%  0.00%  0.00%  1.72% 

The first row (100: 2.31% ....) appears first. About one second later the second row appears, and so on. I would like to modify this output.

When the first row arrives, I want to grab only "100" from the line and show only "100" on the R console instead of the whole line. The same goes for the remaining rows.

[I tried sink(), but it does not work, because sink writes the complete output to the output file.]

[I looked into the do.trace option of the randomForest function, but I got lost; I suspect it calls some C program, although I am not sure.]

I would like to capture the real-time output on the R console.
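The filtering asked for here (keeping only the tree count from each trace line) comes down to reading the integer that precedes the colon. Below is a minimal C sketch of that step, assuming the lines have the "100:   2.31% ..." layout shown in the sample output and are already available as text. The function name parse_tree_count is hypothetical and not part of randomForest, and this does not by itself solve the problem of intercepting the console output in real time.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper (not part of randomForest): returns the tree count at
 * the start of a do.trace line such as "100:   2.31%  7.14% ...", or -1 if
 * the line does not start with "<digits>:" (for example the header line
 * "ntree    OOB      1 ..."). */
long parse_tree_count(const char *line)
{
    char *end;
    long n;

    while (isspace((unsigned char)*line)) line++;   /* skip leading blanks */
    if (!isdigit((unsigned char)*line)) return -1;  /* header or other text */

    n = strtol(line, &end, 10);                     /* read the digits */
    if (*end != ':') return -1;                     /* digits must be followed by ':' */
    return n;
}
```

For the first sample row, parse_tree_count("100:   2.31%  7.14% ...") yields 100, while the header row yields -1 and can be skipped.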

Note: I have seen the following issues.

  1. https://github.com/jni/ray/issues/33
  2. Problematic Random Forest training runtime when using formula interface

Downloaded: https://cran.r-project.org/src/contrib/randomForest_4.6-10.tar.gz

Looking at the C code for regRF (and, I suspect, classRF, which is also called with do.trace when it's a classification problem) and following the 'jprint' flag, which receives the value of the do.trace argument from the surrounding R code, we see:

    /* print header for running output */
    if (*jprint <= *nTree) {
        Rprintf("     |      Out-of-bag   ");
        if (*testdat) Rprintf("|       Test set    ");
        Rprintf("|\n");
        Rprintf("Tree |      MSE  %%Var(y) ");
        if (*testdat) Rprintf("|      MSE  %%Var(y) ");
        Rprintf("|\n");
    }

And:

    /* Print running output. */
    if ((j + 1) % *jprint == 0) {
        Rprintf("%4d |", j + 1);
        Rprintf(" %8.4g %8.2f ", errb, 100 * errb / varY);
        if(*labelts == 1) Rprintf("| %8.4g %8.2f ",
                                  errts, 100.0 * errts / varYts);
        Rprintf("|\n");
    }
    mse[j] = errb;
    if (*labelts) msets[j] = errts;

It should not be particularly difficult to trim that code so that it emits only the tree count at each reporting interval (every hundredth tree in your output), in whatever form you want.
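As a rough illustration of such a trim, the sketch below keeps the same (j + 1) % jprint test as the excerpt but formats only the tree count. It is a standalone approximation, not the package's code: the helper name format_progress is hypothetical, it writes into a buffer with snprintf so it can be compiled outside R, and the jprint <= 0 guard is a defensive assumption of this sketch rather than something taken from the source.

```c
#include <stdio.h>

/* Hypothetical helper (not part of randomForest): decides whether tree j
 * (0-based, as in the loop above) is a reporting point and, if so, writes
 * only the tree count followed by a newline into buf.
 * Returns the number of characters written, or 0 when nothing is reported. */
int format_progress(int j, int jprint, char *buf, size_t size)
{
    if (jprint <= 0) return 0;             /* assumed guard: tracing disabled */
    if ((j + 1) % jprint != 0) return 0;   /* not a reporting point */
    return snprintf(buf, size, "%d\n", j + 1);
}
```

Inside the package source, the equivalent change would presumably be to replace the body of the "Print running output" block with a single Rprintf("%d\n", j + 1); call and to drop the header block, then rebuild and reinstall the package from the modified tarball.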


 