Unrelated nested foreach with an outer %dopar% and an inner %do%

I am running tasks locally in parallel using %dopar% from the foreach package, with the doSNOW package creating the cluster (currently on a Windows machine). I have done this many times before and it works fine until I place an unrelated foreach loop using %do% (i.e. non-parallel) inside it. Then R gives me this error (with traceback):

 Error in { : task 1 failed - "could not find function "%do%""
 3 stop(simpleError(msg, call = expr))
 2 e$fun(obj, substitute(ex), parent.frame(), e$data)
 1 foreach(rc = 1:5) %dopar% {
    aRandomCounter = -1
    if (1 > 0) {
        for (batchi in 1:20) { ...

Here is some code that reproduces the problem on my machine:

require(foreach)
require(doSNOW)
cl<-makeCluster(5) 
registerDoSNOW(cl)
for(stepi in 1:10)  # normal outer for
{
  foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
  {
    aRandomCounter = -1
    if(1 > 0)
    {
      for(batchi in 1:20) 
      {
        anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do% 
        {
          return(runif(1)) # I know this is not efficient, it is a placeholder to reproduce the issue
        }
        aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
      } 
    }
    return(aRandomCounter)
  }
}
stopCluster(cl)

Replacing the inner foreach with a simple for or (l/s)apply is a solution. But is there a way to make this work with the inner foreach, and why does the error occur in the first place?
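
For reference, the (l/s)apply workaround mentioned above might look roughly like the sketch below. It is only an illustration under a few assumptions: the cluster size of 2, the `.combine = c` on the outer loop, and the names `counts` and `draws` are not from the original code. Because vapply and runif are available in a fresh R session, the workers need no extra package attached.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2)   # assumed size, smaller than in the question
registerDoSNOW(cl)

# Outer loop in parallel; the inner loop uses base R only,
# so nothing from foreach has to exist on the workers.
counts <- foreach(rc = 1:5, .combine = c) %dopar% {
  aRandomCounter <- -1
  for (batchi in 1:20) {
    # vapply replaces the inner foreach(...) %do% loop
    draws <- vapply(1:100, function(qrc) runif(1), numeric(1))
    aRandomCounter <- aRandomCounter + sum(draws > 0.5)
  }
  aRandomCounter
}

stopCluster(cl)
```

This avoids the error but does not explain it, which is the second half of the question.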

Of course, I got it to work as soon as I posted it (sorry; I will leave it in case someone else has the same issue). It is a scoping issue: I knew you had to load any external packages within the %dopar% body, but what I did not realize is that this includes the foreach package itself. Each worker is a separate R session, so %do% is not defined there until foreach is loaded on that worker. Here is the solution:

require(foreach)
require(doSNOW)
cl<-makeCluster(5) 
registerDoSNOW(cl)
for(stepi in 1:10)  # normal outer for
{
  foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
  {
    require(foreach) ### the solution
    aRandomCounter = -1
    if(1 > 0) 
    {
      for(batchi in 1:20) 
      {
        anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do% 
        {
          return(runif(1))
        }
        aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
      } 
    }
    return(aRandomCounter)
  }
}
stopCluster(cl)
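
An alternative to calling require(foreach) inside the loop body is the `.packages` argument of foreach(), which asks the backend to load the named packages on each worker before the body runs. The sketch below is a shortened variant of the code above, not the original author's version: the cluster size, the `.combine = c` on the outer loop, and the names `counts` and `draws` are assumptions made for illustration.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2)   # assumed size, smaller than in the question
registerDoSNOW(cl)

# .packages = "foreach" loads foreach on every worker,
# so the inner %do% can be found there.
counts <- foreach(rc = 1:5, .combine = c, .packages = "foreach") %dopar% {
  aRandomCounter <- -1
  for (batchi in 1:20) {
    draws <- foreach(qrc = 1:100, .combine = c) %do% runif(1)
    aRandomCounter <- aRandomCounter + sum(draws > 0.5)
  }
  aRandomCounter
}

stopCluster(cl)
```

Both approaches do the same thing on the worker; `.packages` just keeps the package list in the loop header rather than in the body.
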
  • I know this is an old question, but here is a hint for those who cannot get nested foreach to work.
  • If you parallelize the outer loop by putting %do% inside %dopar%, you need to include .packages = c("doSNOW") in the arguments of the outer loop (%dopar%), otherwise you will run into a "doSNOW not found" error.
  • Generally, people just parallelize the inner loop (%dopar% in %:%), which can be slow for a huge amount of data (waiting for the inner loops' results to be combined).
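
For readers unfamiliar with the %:% form mentioned in the last comment, here is a minimal sketch. The %:% operator merges the two foreach objects into a single stream of tasks that %dopar% then distributes. The counting body, the cluster size, and the name `res` are assumptions for illustration and are not taken from the thread.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2)   # assumed size
registerDoSNOW(cl)

# The inner .combine = "+" sums the 0/1 results for each rc;
# the outer .combine = c collects one count per rc.
res <- foreach(rc = 1:5, .combine = c) %:%
  foreach(qrc = 1:100, .combine = "+") %dopar% {
    as.integer(runif(1) > 0.5)
  }

stopCluster(cl)
```

Each task here is tiny, so this layout mostly demonstrates the syntax; with work this small the communication overhead is likely to dominate.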
