
Variables in function arguments do not pass to cluster when parallel computing

I am having difficulty understanding how variables are scoped and passed to functions when working with the parallel package.

library(parallel)

test <- function(a = 1){
  no_cores <- detectCores()-1
  clust <- makeCluster(no_cores)
  result <- parSapply(clust, 1:10, function(x){a + x})
  stopCluster(clust)
  return(result)
}

test()
[1]  2  3  4  5  6  7  8  9 10 11

x = 1
test(x)

Error in checkForRemoteErrors(val) : 
3 nodes produced errors; first error: object 'x' not found

test() works but test(x) doesn't. When I modify the function as follows, it works.

test <- function(a = 1){
  no_cores <- detectCores()-1
  clust <- makeCluster(no_cores)
  y = a
  result <- parSapply(clust, 1:10, function(x){y + x})
  stopCluster(clust)
  return(result)
}

x = 1
test(x)

Can someone explain what is going on in memory?

This is due to lazy evaluation. The argument a is not evaluated in the function call until its first use. In the first case, the workers do not know a because it was never evaluated in the parent process: what travels with the anonymous function is the still-unevaluated expression x, and no object named x exists on the workers. With the default a = 1 there is nothing to look up outside the function, which is why test() works. You can fix it by forcing the evaluation:

test <- function(a = 1){
    no_cores <- detectCores()-1
    clust <- makeCluster(no_cores)
    force(a)    # evaluate a here, before the closure is sent to the workers
    result <- parSapply(clust, 1:10, function(x){a + x})
    stopCluster(clust)
    return(result)
}

x = 1
test(x)
#  [1]  2  3  4  5  6  7  8  9 10 11
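
The same effect can be reproduced without a cluster, which may make it easier to see what force() changes. This is only a sketch: make_adder_lazy() and make_adder_forced() are made-up names for illustration, and removing x from the calling environment stands in for "x does not exist on the worker".

```r
# Hypothetical helpers, not part of the question's code.
# Returns a closure without ever touching `a`, so `a` stays an unevaluated promise.
make_adder_lazy <- function(a) {
  function(x) a + x
}

# Same closure, but `a` is evaluated before the closure is returned.
make_adder_forced <- function(a) {
  force(a)
  function(x) a + x
}

x <- 1
add_lazy   <- make_adder_lazy(x)
add_forced <- make_adder_forced(x)

# Simulate a worker process: the object x is no longer available.
rm(x)

# The lazy version only now tries to evaluate the expression `x`, and fails.
lazy_result <- tryCatch(add_lazy(10), error = function(e) conditionMessage(e))

# The forced version already holds the value 1, so it still works.
forced_result <- add_forced(10)

print(lazy_result)    # an "object not found" error message
print(forced_result)  # 11
```

In the question's second version, y = a plays the same role as force(a): assigning a to y evaluates it inside test() before anything is shipped to the cluster.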

I would rather use foreach() instead of parSapply():

library(doParallel)

test <- function(a = 1) {
  no_cores <- detectCores() - 1
  registerDoParallel(clust <- makeCluster(no_cores))
  on.exit(stopCluster(clust), add = TRUE)
  foreach(x = 1:10, .combine = 'c') %dopar% { a + x }
}

You don't need to force a to be evaluated when using foreach(). Moreover, you can register the parallel backend outside the function if you want.
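
As a rough sketch of registering the backend outside the function, something like the following should work. It assumes the doParallel package is installed; the worker count of 2 and the variable name res are arbitrary choices for illustration.

```r
library(doParallel)  # also attaches foreach and parallel

# Register the backend once, outside the function.
# Two workers is an arbitrary choice for this sketch.
clust <- makeCluster(2)
registerDoParallel(clust)

# The function no longer creates or stops a cluster itself;
# it uses whatever backend is currently registered.
test <- function(a = 1) {
  foreach(x = 1:10, .combine = 'c') %dopar% { a + x }
}

x <- 1
res <- test(x)
print(res)

# Stop the workers once all parallel work is done.
stopCluster(clust)
```

This way the cost of starting the workers is paid once rather than on every call to test(), at the price of having to remember to stop the cluster yourself.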

See a tutorial on using foreach() there (disclaimer: I'm the author of the tutorial).
