
R parLapply not parallel

I'm currently developing an R package that uses parallel computing, by means of the "parallel" package, to solve some tasks.

I'm getting some really awkward behavior when using clusters defined inside functions of my package: the parLapply function assigns a job to one worker and waits for it to finish before assigning a job to the next worker. Or at least that is what appears to be happening, judging by the log file "cluster.log" and the list of running processes in the Unix shell.

Below is a mockup version of the original function declared inside my package:

.parSolver <- function( varMatrix, var1 ) {

    no_cores <- detectCores()

    #Rows in varMatrix
    rows <- 1:nrow(varMatrix[,])

    # Split rows in n parts
    n <- no_cores
    parts <- split(rows, cut(rows, n))

    # Initiate cluster
    cl <- makePSOCKcluster(no_cores, methods = FALSE, outfile = "/home/cluster.log")
    clusterEvalQ(cl, library(raster))
    clusterExport(cl, "varMatrix", envir=environment())
    clusterExport(cl, "var1", envir=environment())


    rParts <- parLapply(cl = cl, X = 1:n, fun = function(x){
        part <- rasterize(varMatrix[parts[[x]],], raster(var1), .....)
        print(x)
        return(part)
        })

    do.call(merge, rParts)
}

NOTES:

  • I'm using makePSOCKcluster because I want the code to run on Windows and Unix systems alike, although this particular problem only manifests itself on a Unix system.
  • The functions rasterize and raster are defined in library(raster), which is loaded on each worker.

The weird part to me is that if I execute the exact same code of the function parSolver in the global environment, everything works smoothly: all workers take one job at the same time and the task completes in no time. However, if I do something like:

library(myPackage)

varMatrix <- (...)
var1 <- (...)
result <- parSolver(varMatrix, var1)

the described problem appears.

It appears to be a load balancing problem; however, that does not explain why it works OK in one situation and not in the other.

Am I missing something here? Thanks in advance.

I don't think parLapply is running sequentially. More likely, it's just running inefficiently, making it appear to run sequentially.

I have a few suggestions to improve it:

  • Don't define the worker function inside parSolver
  • Don't export all of varMatrix to each worker
  • Create the cluster outside of parSolver

The first point is important because, as your example now stands, all of the variables defined in parSolver will be serialized along with the anonymous worker function and sent to the workers by parLapply. By defining the worker function outside of any function, the serialization won't capture any unwanted variables. This also explains why the same code behaves well when run at top level: a function defined in the global environment is serialized without that environment's contents.
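The variable capture described above can be observed without a cluster by comparing serialized sizes. This is only a minimal sketch: make_worker, big, and top_level_worker are hypothetical names made up for illustration, and the sizes you see will depend on your R session.

```r
# Hypothetical illustration: a function created inside another function
# keeps a reference to that function's local variables.
make_worker <- function() {
    big <- numeric(1e6)      # roughly 8 MB of local data, never used by the worker
    function(x) x + 1        # closure whose environment still contains `big`
}
nested_worker <- make_worker()

# The same worker defined at top level.
top_level_worker <- function(x) x + 1
# No-op when run at top level; keeps the sketch valid if it is sourced
# inside some other environment.
environment(top_level_worker) <- globalenv()

# serialize() is what gets applied to a function before it is sent over a socket.
size_nested <- length(serialize(nested_worker, NULL))
size_top    <- length(serialize(top_level_worker, NULL))

cat("nested closure:    ", size_nested, "bytes\n")
cat("top-level function:", size_top, "bytes\n")
```

The nested closure drags `big` along even though the worker never touches it, which is the same thing that happens to varMatrix and the other locals of parSolver in the question.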

The second point avoids unnecessary socket I/O and uses less memory, making the code more scalable.

Here's a fake but self-contained example, similar to yours, that demonstrates my suggestions:

# Define worker function outside of any function to avoid
# serialization problems (such as unexpected variable capture)
workerfn <- function(mat, var1) {
    library(raster)
    mat * var1
}

parSolver <- function(cl, varMatrix, var1) {
    parts <- splitIndices(nrow(varMatrix), length(cl))
    varMatrixParts <- lapply(parts, function(i) varMatrix[i,,drop=FALSE])
    rParts <- clusterApply(cl, varMatrixParts, workerfn, var1)
    do.call(rbind, rParts)
}

library(parallel)
cl <- makePSOCKcluster(3)
r <- parSolver(cl, matrix(1:20, 10, 2), 2)
print(r)
stopCluster(cl)  # shut down the workers when done

Note that this takes advantage of the clusterApply function to iterate over a list of row-chunks of varMatrix, so that the entire matrix doesn't need to be sent to every worker. It also avoids calls to clusterEvalQ and clusterExport, simplifying the code as well as making it a bit more efficient.
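To see what each worker would actually receive, the chunking step can be run on its own, without starting a cluster. This is a small sketch that reuses the 10 x 2 toy matrix from the example; the names m, idx, and chunks are chosen here for illustration only.

```r
library(parallel)

m <- matrix(1:20, 10, 2)

# splitIndices() divides the row indices into one contiguous group per worker.
idx <- splitIndices(nrow(m), 3)

# drop = FALSE keeps each chunk a matrix, even if a chunk has a single row.
chunks <- lapply(idx, function(i) m[i, , drop = FALSE])

# Each worker gets only its own rows...
print(sapply(chunks, nrow))

# ...and binding the chunks back together reproduces the original matrix.
reassembled <- do.call(rbind, chunks)
print(identical(reassembled, m))
```

Because the chunks are passed as the list argument of clusterApply, each one travels to exactly one worker, instead of the full matrix being exported to all of them.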
