R: Parallelization with doParallel and foreach
I made the following sequential mini example in R:
all_list <- list()
all_list[1] <- list(1:6000)
all_list[2] <- list(100000:450000)
all_list[3] <- list(600000:1700000)
all_list[4] <- list(2000000:3300000)
all_list[5] <- list(3600000:5000000)
find <- list(c(12800, 12800, 12800, 25600, 51200, 102400, 204800, 409600, 819200, 1638400, 1638400, 2457600, 3276800, 4096000, 4915200, 4915200))
result <- list()
index <- 1
current_Intervall <- 1
current_number <- 1
while(current_number <= 5000000){
  for(i in 1:length(find[[1]])){
    if(current_number == find[[1]][i]){
      result[[index]] <- current_number
      index <- index + 1
      break
    }
  }
  current_number <- current_number + 1
  last <- lengths(all_list[current_Intervall])
  if(current_number > all_list[[current_Intervall]][last]){
    if(current_Intervall == length(all_list)){
      break
    }else{
      current_Intervall <- current_Intervall + 1
      current_number <- all_list[[current_Intervall]][1]
    }
  }
  print(current_number)
}
I want to parallelize this code on Windows. I thought of the doParallel package and foreach loops, because I did not find a package that supports parallel while loops. This is what I have tried so far:
library(doParallel)
all_list <- list()
all_list[1] <- list(1:6000)
all_list[2] <- list(100000:450000)
all_list[3] <- list(600000:1700000)
all_list[4] <- list(2000000:3300000)
all_list[5] <- list(3600000:5000000)
find <- list(c(12800, 12800, 12800, 25600, 51200, 102400, 204800, 409600, 819200, 1638400, 1638400, 2457600, 3276800, 4096000, 4915200, 4915200))
result <- list()
index <- 1
current_Intervall <- 1
current_number <- 1
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
registerDoParallel(cl)
print(current_number)
foreach(current_number=1:5000000) %dopar% {
  for(i in 1:length(find[[1]])){
    if(current_number == find[[1]][i]){
      result[[index]] <- current_number
      index <- index + 1
      break
    }
  }
  # current_number <- current_number + 1
  last <- lengths(all_list[current_Intervall])
  if(current_number > all_list[[current_Intervall]][last]){
    if(current_Intervall == length(all_list)){
      break
    }else{
      current_Intervall <- current_Intervall + 1
      current_number <- all_list[[current_Intervall]][1]
    }
  }
  print(current_number)
}
stopCluster(cl)
But print does not output anything, and the loop has still not terminated after about 2 minutes, whereas the sequential example finishes after a few seconds. I think something is wrong.
Another question: is it possible to redefine the counter in foreach loops? In the while loop above I can set the counter "current_number" arbitrarily. But I think R's for loops do not allow redefining the counter, right? Is there perhaps a better package or an alternative loop for parallelizing the first example?
Best regards, Brayn
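On the counter question: the iterations of a foreach loop run as independent tasks, so one iteration cannot change the value the next one receives, and assignments to result or index inside %dopar% only modify the worker's own copy. One possible way around this, offered as a rough sketch rather than the definitive fix, is to treat each interval in all_list as one task and let every task return its matches. The name targets (a flattened, de-duplicated copy of find, renamed so it does not mask utils::find), the loop variable interval, and the choice of two workers are assumptions made for this illustration.

```r
library(doParallel)  # also attaches foreach and parallel

# Same intervals as in the question
all_list <- list(1:6000, 100000:450000, 600000:1700000,
                 2000000:3300000, 3600000:5000000)

# Hypothetical rename of `find`, with duplicates dropped
targets <- unique(c(12800, 25600, 51200, 102400, 204800, 409600, 819200,
                    1638400, 2457600, 3276800, 4096000, 4915200))

# Two workers is an arbitrary choice for this sketch
cl <- makeCluster(2)
registerDoParallel(cl)

# One task per interval. Each task returns the targets that fall inside
# its interval; .combine = c concatenates the per-interval results in order.
result <- foreach(interval = all_list, .combine = c) %dopar% {
  targets[targets %in% interval]
}

stopCluster(cl)
print(result)
```

Because each task returns a value instead of writing to a shared list, no counter has to be redefined and no break is needed. With only five intervals there are only five tasks, so this keeps at most five workers busy.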
If you want to print output while running in parallel, use makeCluster(no_cores, outfile = "").
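A minimal sketch of that setting follows. By default makeCluster discards whatever the workers write to stdout, and outfile = "" turns that redirection off. The two-worker cluster and the toy loop over 1:4 are assumptions for illustration only, and whether the lines actually show up may depend on how R is run (a terminal session versus a GUI front end).

```r
library(doParallel)  # also attaches foreach and parallel

# outfile = "" means the workers' output is not redirected to the null device
cl <- makeCluster(2, outfile = "")
registerDoParallel(cl)

squares <- foreach(i = 1:4, .combine = c) %dopar% {
  print(i)  # worker-side output; lines from different workers may interleave
  i^2       # the value of the last expression is what foreach collects
}

stopCluster(cl)
```

The printed lines are only a debugging aid. The values the loop computes still have to come back through the foreach return value, here collected in squares.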