Progress bar in random forest model in R
I am using a randomForest model in R.
For a large number of trees, my program takes a long time to complete.
In the randomForest function I can set do.trace=TRUE to see the progress in real time. Sample real-time output on the R console is as follows:
ntree OOB 1 2 3 4 5 6 7 8 9
100: 2.31% 7.14% 2.08% 0.00% 2.25% 10.81% 0.90% 0.00% 0.00% 1.72%
200: 1.95% 7.14% 2.08% 0.00% 2.25% 8.11% 0.00% 0.00% 0.00% 1.72%
300: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
400: 1.95% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 3.45%
500: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
600: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
700: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
800: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
900: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
1000: 1.78% 7.14% 2.08% 0.00% 1.69% 8.11% 0.00% 0.00% 0.00% 1.72%
The first row (100: 2.31% ...) is printed first. About one second later the second row appears, and so on.
I would like to modify this output.
When the first row arrives, I need to grab only "100" from the whole line and show only "100" on the R console instead of the whole line. The same goes for the rest of the rows.
[I tried sink(), but it does not work, because sink writes the complete output to the output file.]
[I searched for the do.trace option in the randomForest function, but I got lost, as I suspect it calls some C program, although I am not sure.]
I would like to grab the real-time output on the R console.
Note: I have seen the following issues.
Downloaded: https://cran.r-project.org/src/contrib/randomForest_4.6-10.tar.gz
Looking at the C code in refRF.C (and, I suspect, classRF.C, which is what gets called with do.trace when it's a classification problem), and following the 'jprint' flag, which is what receives the do.trace flag from the surrounding R code, we see:
    /* print header for running output */
    if (*jprint <= *nTree) {
        Rprintf("     |      Out-of-bag   ");
        if (*testdat) Rprintf("|       Test set    ");
        Rprintf("|\n");
        Rprintf("Tree |      MSE  %%Var(y) ");
        if (*testdat) Rprintf("|      MSE  %%Var(y) ");
        Rprintf("|\n");
    }

And:

    /* Print running output. */
    if ((j + 1) % *jprint == 0) {
        Rprintf("%4d |", j + 1);
        Rprintf(" %8.4g %8.2f ", errb, 100 * errb / varY);
        if(*labelts == 1) Rprintf("| %8.4g %8.2f ",
                                  errts, 100.0 * errts / varYts);
        Rprintf("|\n");
    }
    mse[j] = errb;
    if (*labelts) msets[j] = errts;
It should not be particularly difficult to trim that code to the point where it emits only the every-hundredth-tree notification, in whatever form you want.
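As a rough illustration of that trimming, here is a minimal standalone sketch, not the package's actual code. The helper names format_progress and report_progress are hypothetical, and printing goes through standard C I/O so the snippet compiles outside of R; inside the package you would presumably keep a single Rprintf("%d\n", j + 1) call in place of the fputs() call.

```c
#include <stdio.h>

/* Hypothetical helper (not part of randomForest): when the 0-based tree
 * index j completes a block of jprint trees, write only the 1-based tree
 * count into buf. This mirrors the "(j + 1) % *jprint == 0" check in the
 * excerpt above. Returns the number of characters written, or 0 when no
 * progress line is due. */
int format_progress(int j, int jprint, char *buf, size_t size)
{
    if (jprint <= 0 || (j + 1) % jprint != 0) {
        return 0;
    }
    return snprintf(buf, size, "%d\n", j + 1);
}

/* Stand-in for the tree-growing loop (assumption: the real loop grows one
 * tree per iteration). Returns how many progress lines were emitted. */
int report_progress(int n_tree, int jprint)
{
    char line[32];
    int emitted = 0;

    for (int j = 0; j < n_tree; j++) {
        /* ... one tree would be grown here ... */
        if (format_progress(j, jprint, line, sizeof line) > 0) {
            fputs(line, stdout);  /* Rprintf("%d\n", j + 1) inside the package */
            emitted++;
        }
    }
    return emitted;
}
```

Calling report_progress(1000, 100) prints 100, 200, and so on up to 1000, one number per line. To get the same effect in randomForest itself, you would have to make the equivalent change in the downloaded source and reinstall the package from the modified tarball.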