
R and Knitr/HTML - dealing with scripts that take a long time to run

I'm documenting some code that will be read by other students, and knitr seems like a good way to do that. One thing bothers me, though: for scripts that take a long time to run, my approach is not efficient.

Suppose I have something like this:

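<!--begin.rcode example1, fig.width=8, fig.height=10
library(C50)  # provides C5.0()
# Predictors and class labels from the data frame `data`
input <- data[,c("K12","K23","delta")]
output <- data[,"Class"]
# Time the model fit
startTime <- proc.time()
result <- C5.0(input, output)
totalTime <- proc.time()-startTime
cat("Execution time: ", totalTime[3], "\n")
# Plot and print the fitted tree
plot(result)
result
end.rcode-->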

This creates, plots and prints a decision tree based on a data frame called data. I would like to have several chunks like this in a single .Rhtml document, and in each chunk I'd change the data set or the parameters of the algorithm.

If the data set is large, the call to C5.0 will take some time. If I add other examples to the same .Rhtml file, I have to rerun the whole file to create the .html file and figures. I've been doing this repeatedly, since I want to comment on the results of the execution, and to create the .html file I need to knit the .Rhtml file again, which means rerunning the code.

What I am looking for is a way to either tell knitr that it can reuse the previous results of executing a chunk (I don't see a way to do that, and it seems risky and manual) or break the .Rhtml into pieces that can be knitted separately but still combined into a whole .html when I need it. Even better would be something like a "make" for knitr that reruns only the changed .Rhtml files and creates a single .html file. By the way, if anyone knows how to knit a .Rhtml file from the command line, that would also be useful. I'm using RStudio for convenience, but a single command-line command would help.

I know this seems subjective, but I am not looking for a better (subjectively, "my approach is better than yours") way to do a task. Any approach that works without the need to rerun the whole .Rhtml page will do.

Thanks.

One very easy way of combining long-running code with knitr is to use the cache option. To use caching, just add cache=TRUE to the chunk options.
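As a minimal sketch, here is the chunk from the question with cache=TRUE added to its header. It assumes that input and output have already been built as in the question and that the C50 package is loaded. knitr stores the chunk's results on disk and reuses them on later runs, re-executing the chunk only when its code or options change.

```r
<!--begin.rcode example1, cache=TRUE, fig.width=8, fig.height=10
# Assumes `input` and `output` exist and library(C50) has been called
startTime <- proc.time()
result <- C5.0(input, output)
totalTime <- proc.time()-startTime
cat("Execution time: ", totalTime[3], "\n")
plot(result)
result
end.rcode-->
```

Note that a cached chunk is not rerun just because the data it reads changed elsewhere; if that matters, the dependson chunk option may be worth looking at.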

My personal experience is that every so often you have to delete the directories that the cache creates and do a clean build.
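A clean build could be scripted roughly as follows. This is only a sketch: the file name report.Rhtml and the script name clean_build.R are hypothetical, and it assumes the cache lives in a cache/ directory next to the document (check the cache.path chunk option if yours is elsewhere).

```r
# clean_build.R (hypothetical helper script, not part of the original answer)
library(knitr)

input_file <- "report.Rhtml"  # assumed name of the document to knit
cache_dir  <- "cache"         # assumed cache location; see the cache.path option

# Throw away all cached chunk results so every chunk is re-executed
unlink(cache_dir, recursive = TRUE)

# Knit the document again from scratch
knit(input_file)
```

Saved to a file, this can be run outside RStudio with `Rscript clean_build.R`, which also covers the command-line part of the question.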
