How to force idle workers to take jobs in parallel R?

I am new to posting here -- I searched and couldn't find an answer to my question. I have run the following parallelized R code (from a blog on parallel computing in R) using the `parallel` package on two different machines, yet I get very different process-time results. The first machine is a Lenovo laptop with Windows 8, 8 GB RAM, Intel i7, 2 cores/4 logical processors. The second machine is a Dell desktop with Windows 7, 16 GB RAM, Intel i7, 4 cores/8 logical processors. The code sometimes runs much slower on the second machine.

I believe the reason is that the second machine is not using the worker nodes to complete the task. When I use the function `snow.time()` from the `snow` package to check node usage, the first machine is using all available workers to complete the task. However, on the more powerful machine it never uses the workers: the entire task is handled by the master. Why is the first machine using workers while the second, running the exact same code, is not? And how do I 'force' the second machine to use the available workers so that the code is truly parallelized and the processing time is sped up? The answers would help me tremendously with other work I am doing. Thanks in advance.

The graphs from `snow.time()` are below, along with the code I used:
runs <- 1e7
manyruns <- function(n) mean(unlist(lapply(X = 1:(runs/4), FUN = onerun)))

library(parallel)
cores <- 4
cl <- makeCluster(cores)

# Send function to workers
tobeignored <- clusterEvalQ(cl, {
  onerun <- function(.) {  # function of no arguments
    doors <- 1:3
    prize.door <- sample(doors, size = 1)
    choice <- sample(doors, size = 1)
    if (choice == prize.door) return(0) else return(1)  # always switch
  }
  NULL
})

# Send runs to the workers
tobeignored <- clusterEvalQ(cl, {runs <- 1e7; NULL})

runtime <- snow.time(avg <- mean(unlist(clusterApply(cl = cl, x = rep(runs, 4), fun = manyruns))))
stopCluster(cl)
plot(runtime)
Try `clusterApplyLB` instead of `clusterApply`. The "LB" stands for load balancing.

The non-LB version divides the tasks between the nodes and sends them in a batch, but if one node finishes early it sits idle waiting for the others. The LB version sends one task to each node, then watches the nodes: when a node finishes, it sends that node another task, until all the tasks are assigned. This is more efficient if the time per task varies widely, but less efficient if all the tasks take about the same amount of time.
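A small sketch of the difference, using an invented workload where one task is much slower than the rest (the sleep times are purely illustrative):

```r
library(parallel)

cl <- makeCluster(2)
tasks <- c(0.2, 0, 0, 0)  # per-task sleep times in seconds; one slow, three fast

# clusterApply pre-assigns tasks to nodes in fixed batches;
# clusterApplyLB hands out one task at a time as workers become free.
res_static <- clusterApply(cl, tasks, function(s) { Sys.sleep(s); s })
res_lb     <- clusterApplyLB(cl, tasks, function(s) { Sys.sleep(s); s })
stopCluster(cl)

# Both return results in the order of the input; only the scheduling differs.
```

With only four tasks the wall-clock difference is tiny, but with many uneven tasks the LB version keeps all workers busy until the task list is exhausted.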
Also check the versions of R and parallel. If I remember correctly, the `clusterApply` function used to not run in parallel on Windows machines (but I don't see that note any more, so it has likely been remedied in recent versions), so the difference could come from different versions of the parallel package. The `parLapply` function did not have the same issue, so you could rewrite your code to use it instead and see if that makes a difference.
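A minimal sketch of the question's Monte Carlo simulation rewritten around `parLapply` (the run count is scaled down from the question's 1e7 so the check is quick):

```r
library(parallel)

# Same switch-always Monty Hall trial as in the question.
onerun <- function(.) {
  doors <- 1:3
  prize.door <- sample(doors, size = 1)
  choice <- sample(doors, size = 1)
  if (choice == prize.door) 0 else 1  # 1 = switching would have won
}

runs <- 1e5  # scaled down from 1e7 for a quick check
cl <- makeCluster(2)
clusterExport(cl, "onerun")
# parLapply splits 1:runs into one chunk per worker up front.
wins <- parLapply(cl, 1:runs, onerun)
stopCluster(cl)
avg <- mean(unlist(wins))  # should be close to 2/3
```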
I don't think it's possible to use the `snow.time` function from the snow package while getting all of the other functions from the parallel package. The source for parallel in R 3.2.3 has some placeholder code for timing, but it doesn't appear to be either complete or compatible with the `snow.time` function in snow. I think you'll still get correct results from `clusterApply`, but the object returned by `snow.time` will be equivalent to executing:
runtime <- snow.time(Sys.sleep(20))
If you want to use `snow.time`, I suggest loading only snow, although you can still access functions from parallel such as `detectCores` using the syntax `parallel::detectCores()`.
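The `::` access pattern looks like this; the sketch uses only functions from the base `parallel` package so it runs even without snow installed:

```r
# Calling parallel's functions via :: does not attach the package,
# so snow's versions of clusterApply() etc. are never masked.
n_cores <- parallel::detectCores()
cl <- parallel::makeCluster(2)
squares <- parallel::parLapply(cl, 1:4, function(i) i^2)
parallel::stopCluster(cl)
```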
I don't really know why your script occasionally runs slowly on your desktop machine, but I think the way you are parallelizing it is reasonable and correct. You might want to try benchmarking `manyruns` sequentially on both machines in order to rule out any differences in the random-number generation code on the two systems. But perhaps the problem was caused by a system service that was slowing down the whole system.
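One way to do that sequential benchmark; the `onerun` body is copied from the question, and the run count is scaled down from 1e7 so the check finishes quickly:

```r
onerun <- function(.) {
  doors <- 1:3
  prize.door <- sample(doors, size = 1)
  choice <- sample(doors, size = 1)
  if (choice == prize.door) 0 else 1  # always switch
}

runs <- 1e5  # scale back up to 1e7 for a real comparison
# Compare this elapsed time across the two machines: a large gap here
# points at RNG/CPU differences rather than parallel scheduling.
t_seq <- system.time(
  avg <- mean(vapply(1:runs, onerun, numeric(1)))
)[["elapsed"]]
```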
I cannot put code in comments... I do not understand your program very well. What kind of cluster are you creating? Try this, adjusting 2e6 to whatever works for you:
library(parallel)
library(Rmpi)
library(snow)
cl <- makeMPIcluster(3)
t <- system.time(parLapply(cl, 1:100, function(i) mean(rnorm(2e6))))
stopCluster(cl)
print(t)
For me it runs for 10 seconds (2-core/hyperthreading/5-year-old laptop/Linux), and all 4 workers are 100% busy. You may also try the same with socket clusters.
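A socket-cluster (PSOCK) variant of the same test, with the workload shrunk so it runs quickly; unlike the MPI version above, it needs no Rmpi:

```r
library(parallel)

cl <- makeCluster(3)  # default PSOCK (socket) cluster, no MPI required
t_sock <- system.time(
  means <- parLapply(cl, 1:20, function(i) mean(rnorm(2e5)))
)[["elapsed"]]
stopCluster(cl)
# Each result is the mean of 2e5 standard normals, so all should be near 0.
```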