R drake: file_out name with a variable

Question

I am using drake to create multiple output files, and I want to specify the path with a variable. Something like:

outpath <- "data"
outfile <- file.path(outpath, "mydata.csv")
write.csv(df, outfile)

But file_out doesn't seem to work with any argument other than a literal string.

To give a small code example:

Code setup

library(drake)

outpath <- "data"
# for reproducibility only
if (!dir.exists(outpath)) dir.create(outpath)

make_data <- function() data.frame(x = 1:10, y = rnorm(10))

Working Code

Directly specifying the file:

p0 <- drake_plan(
  df = make_data(),
  write.csv(df, file_out("data/mydata0.csv"))
)
make(p0)
#> target file "data/mydata0.csv"

Failing Code

Using file.path to construct the outfile:

p1 <- drake_plan(
  df = make_data(),
  write.csv(df, file_out(file.path(outpath, "mydata1.csv")))
)
make(p1)
#> target file "mydata1.csv"
#> Error: The file does not exist: mydata1.csv
#> In addition: Warning message:
#> File "mydata1.csv" was built or processed,
#> but the file itself does not exist.

I guess drake picks up only the literal string as a target and not the result of file.path(...). For example, this fails as well:

p2 <- drake_plan(
  df = make_data(),
  outfile = file.path(outpath, "mydata1.csv"),
  write.csv(df, file_out(outfile))
)
#> Error: found an empty file_out() in command: write.csv(df, file_out(outfile))

Any idea how to fix that?

Answer 1

Sorry I am so late to this thread. I can more easily find questions with the drake-r-package tag.

Thanks to @Alexis for providing the link to the relevant thread. Wildcards can really help here.

All your targets, input files, and output files need to be explicitly named in advance. This is so drake can figure out all the dependency relationships without evaluating any code in your plan. Since drake is responsible for figuring out which targets to build when, I am probably not going to relax this requirement in future development.
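To illustrate why a non-literal argument fails, here is a minimal base R sketch of what a purely static scan of a command can see. It is not drake's actual implementation, and the helper find_strings is hypothetical. It walks the unevaluated command from the question and collects the string literals it contains.

```r
# Hypothetical helper (not part of drake): collect the string literals
# that appear in an unevaluated expression, without running any of it.
find_strings <- function(e) {
  if (is.character(e)) return(e)
  if (is.call(e)) return(unlist(lapply(as.list(e), find_strings)))
  character(0)
}

# The failing command from the question, captured but never evaluated.
cmd <- quote(write.csv(df, file_out(file.path(outpath, "mydata1.csv"))))

literals <- find_strings(cmd)
print(literals)
#> [1] "mydata1.csv"
```

Because outpath is only a symbol here and is never evaluated, the "data" directory never shows up in the scan, which is consistent with the "mydata1.csv" target reported in the question.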

For what it's worth, tidy evaluation may also help.

library(drake) # version 5.3.0
pkgconfig::set_config("drake::strings_in_dots" = "literals")
file <- file.path("dir", "mydata1.csv")
drake_plan(
  df = make_data(),
  output = write.csv(df, file_out(!!file))
)
#> # A tibble: 2 x 2
#>   target         command                                       
#> * <chr>          <chr>                                         
#> 1 df             make_data()                                   
#> 2 output         "write.csv(df, file_out(\"dir/mydata1.csv\"))"
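The mechanism behind !! can be seen in isolation with rlang, without drake. The sketch below assumes only that drake_plan() unquotes in the same way rlang's expr() does, which is what the plan printed above suggests. It uses the outpath variable from the question.

```r
library(rlang)

outpath <- "data"
outfile <- file.path(outpath, "mydata1.csv")

# expr() captures the call without running it; !! replaces outfile with
# its current value, so the captured command contains a literal string.
cmd <- expr(write.csv(df, file_out(!!outfile)))
print(cmd)
#> write.csv(df, file_out("data/mydata1.csv"))
```

The substitution happens when the command is captured, so by the time the command is inspected the path is already a literal string.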

EDIT: metaprogramming

I recently added a lengthy section to the manual on metaprogramming. If you want more flexible and automated ways to generate workflow plan data frames, you may have to abandon the drake_plan() function and do more involved tidy evaluation. The discussion on the issue tracker is also relevant.
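As a rough sketch of what generating a plan without drake_plan() might look like, the base R code below builds a data frame with the same target and command columns as the tibble printed earlier. The file names and target names are made up for illustration, and whether make() accepts a plain data frame like this is an assumption that may depend on the drake version.

```r
outpath <- "data"
# Illustrative file names, not from the original question.
files <- file.path(outpath, c("mydata1.csv", "mydata2.csv"))

# One row per target; each command is a string in which the output path
# already appears as a literal inside file_out().
plan <- data.frame(
  target = c("df", paste0("output", seq_along(files))),
  command = c(
    "make_data()",
    sprintf("write.csv(df, file_out(\"%s\"))", files)
  ),
  stringsAsFactors = FALSE
)
print(plan$command)
```

Building the command strings with sprintf() means the variable is resolved while the plan is constructed, so every path is explicit before any dependency analysis takes place.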

Answer 1: accepted, 3 votes, 2018-07-27 20:26:58
