
How to use the parallel package inside another package, using devtools?

When running the following code in an R terminal:

library(parallel)
func <- function(a,b,c) a+b+c

testfun <- function() {
    cl <- makeCluster(detectCores(), outfile="parlog.txt")    
    res <- clusterMap(cl, func, 1:10, 11:20, MoreArgs = list(c=1))
    print(res)
    stopCluster(cl)
}

testfun()

... it works just fine. However, when I copy the two function definitions into my own package, add a line #' @import parallel, run devtools::load_all("mypackage") in the R terminal, and then call testfun(), I get an error:

Error in unserialize(node$con) (from myfile.r#7) :
error reading from connection

where #7 is the line containing the call to clusterMap.

So the exact same code works in the terminal but not inside a package.

If I take a look into parlog.txt, I see the following:

starting worker pid=7204 on localhost:11725 at 13:17:50.784
starting worker pid=4416 on localhost:11725 at 13:17:51.820
starting worker pid=10540 on localhost:11725 at 13:17:52.836
starting worker pid=9028 on localhost:11725 at 13:17:53.849
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''
Error: (converted from warning) namespace 'mypackage' is not available and has been replaced
by .GlobalEnv when processing object ''

What's the root of this problem and how do I resolve it?

Note that I'm doing this with a completely fresh, bare package (created by devtools::create), so there are no interactions with existing, possibly destructive code.

While writing the question, I actually found the answer and am going to share it here.

The problem here is the combination of the packages devtools and parallel.

Apparently, parallel requires the package mypackage to be installed into a library that the workers can find, even if you do not load it in the workers explicitly (e.g. using clusterEvalQ(cl, library(mypackage)) or something similar)! The log above points to the reason: func is defined in the namespace of mypackage, so each worker has to load that namespace when it unserializes the function.
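One way to confirm this is to ask the workers directly whether they can load the namespace. The following is only a minimal sketch: "mypackage" is the placeholder name from the question, and the two-worker cluster is an arbitrary choice for illustration.

```r
library(parallel)

pkg <- "mypackage"  # placeholder package name from the question

cl <- makeCluster(2)  # two workers, chosen only for illustration
# Ask every worker whether it can load the namespace from its own library paths
available <- unlist(clusterCall(cl, requireNamespace, pkg, quietly = TRUE))
stopCluster(cl)

# One logical value per worker; FALSE means the worker cannot find the package
print(available)
```

If any entry is FALSE, the package is not installed anywhere the workers look, which matches the "namespace 'mypackage' is not available" lines in parlog.txt.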

I was employing the usual devtools workflow, meaning that I was working in dev_mode() all of the time. However, this led to my package being installed only in a special dev mode library folder (I do not know exactly how this works internally). That folder is not searched by the worker processes started by parallel, since they are not in dev_mode.
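To see the mismatch, one can compare the library search path of the master session with that of a worker. This is a hedged sketch using only base R and parallel; the variable names are made up for illustration, and it assumes it is run from the session in which dev mode (or any other change to the library path) is active.

```r
library(parallel)

cl <- makeCluster(2)  # two workers, chosen only for illustration

# Library search path of the master session
master_paths <- .libPaths()
# Library search path as seen by each worker (a list with one entry per worker)
worker_paths <- clusterEvalQ(cl, .libPaths())
stopCluster(cl)

# Libraries the master searches but the first worker does not
missing_on_worker <- setdiff(master_paths, worker_paths[[1]])
print(missing_on_worker)
```

Any path printed here is a library that packages may be installed into from the master session while remaining invisible to the workers.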

So here is my 'workaround':

library(devtools)
## turn off dev mode (explicitly, instead of toggling)
dev_mode(on = FALSE)
## install the package into a 'real' library
install("mypackage")
library(mypackage)
## ... and now the following works:
mypackage:::testfun()

As Hadley just pointed out correctly, another workaround would be to add the line

clusterEvalQ(cl, dev_mode())

right after cluster creation. That way, the workers are in dev mode too, and one can keep using dev_mode.
