Limiting the computation time of each subproblem in clusterMap (R parallel package)

Question

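I need to run a task over a range of input parameters whose run times differ widely. I do this in parallel using parallel::clusterMap() with dynamic scheduling. Sometimes the computation time for an individual problem is simply infeasible. Is there a way to kill the cluster after a predefined time limit and still retrieve the tasks that have already finished?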

If I only set the timeout argument, the cluster is killed without returning the tasks that have already finished. Minimal example (not working!):

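library(parallel)

# Each task simply sleeps for t seconds and then returns t.
f <- function(t) {Sys.sleep(t); return(t)}
t <- c(1, 1, 2, 15, 2, 1, 3, 4, 1, 1, 1, 3)
cl <- makeCluster(3, timeout = 5)
# Fails: the results of the tasks that already finished are not returned.
as.numeric(clusterMap(cl, f, t, .scheduling = "dynamic"))
stopCluster(cl)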

Answer 1

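I would not kill the workers. Instead, I would have each worker stop by itself after a given amount of time.

Here is an example very close to the code you posted: each worker is active for t seconds, but for no more than 4 seconds. After t or 4 seconds, whichever comes first, it stops and returns what it has done so far: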

library(parallel)

f <- function(t) {
  executionTime <- 0
  # Work in 1-second steps; stop once t seconds or the 4-second cap is reached.
  while (executionTime < t && executionTime < 4) {
    executionTime <- executionTime + 1
    Sys.sleep(1)
  }
  return(executionTime)
}
t <- c(1, 1, 2, 15, 2, 1, 3, 4, 1, 1, 1, 3)

cl <- makeCluster(3)
print(as.numeric(clusterMap(cl, f, t, .scheduling = "dynamic")))
stopCluster(cl)

# [1] 1 1 2 4 2 1 3 4 1 1 1 3

Note that the fourth element is 4, not 15. The worker iterates 4 times (4 seconds), then stops and returns 4.
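The approach above needs a task that can be split into steps. If the real computation is a single call that cannot check the clock itself, a possible variation (not part of the original answer) is to let base R enforce a limit inside the worker with setTimeLimit() and return NA when it is hit. This is only a sketch: the helper name with_limit and the 5-second limit are assumptions made for illustration, and Sys.sleep() again stands in for the real work.

```r
library(parallel)

# Hypothetical helper (not from the original answer): run one task under an
# elapsed-time limit and return NA if the limit is reached.
with_limit <- function(t, limit = 5) {
  # Reset the limit on exit so it cannot leak into the worker's next task.
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  setTimeLimit(elapsed = limit, transient = TRUE)
  tryCatch({
    Sys.sleep(t)  # stand-in for the real computation
    t
  }, error = function(e) NA_real_)  # any error, including the timeout, gives NA
}

t <- c(1, 1, 2, 15, 2, 1, 3, 4, 1, 1, 1, 3)

cl <- makeCluster(3)
res <- as.numeric(clusterMap(cl, with_limit, t, .scheduling = "dynamic"))
stopCluster(cl)
print(res)
```

With these inputs, the 15-second task should come back as NA while the other tasks return their own value of t. Two caveats: the handler turns every error into NA, not only timeouts, and R checks time limits only at points where a user interrupt could occur, so long-running compiled code may overrun the limit.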

Answer 1: accepted, 2015-08-06 12:59:59