Run multiple R scripts in parallel using foreach and controlling number of cores
Using parallel in R for whole scripts
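I have a question about parallel computing for whole scripts. My script imports data, then randomly splits it into train and validation dataframes, and does the preprocessing and validation. I would like to iterate the same script with many different seeds.
Is it possible to do this in parallel? The scripts do not interfere with each other.
Many thanks!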
seeds <- c(2343242,324256,764865,3524526,574574,75624,15436,674767,4325265,2462626,
           245264,647474,2465374,4253532,5787462,35636,357484,34524,74859,1352637)
for (i in 1:length(seeds))
{
  set.seed(seeds[i])
  seed <- seeds[i]
  print(seeds[i])
  print("begin import")
  source(file = "import.r")
  print("preprocessing")
  source(file = "preProc.r")
  print("normal")
  source(file = "algorithms and datasets.r")
  print("resampled")
  source(file = "algorithms and datasets up down.r")
}
A verbatim one-to-one solution:
library(future.apply)
plan(multisession)
seeds <- c(2343242,324256,764865,3524526,574574,75624,15436,674767,4325265,2462626,
           245264,647474,2465374,4253532,5787462,35636,357484,34524,74859,1352637)
empty <- future_lapply(seeds, function(seed) {
  set.seed(seed)
  print(seed)
  print("begin import")
  source(file = "import.r")
  print("preprocessing")
  source(file = "preProc.r")
  print("normal")
  source(file = "algorithms and datasets.r")
  print("resampled")
  source(file = "algorithms and datasets up down.r")
})
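One thing worth checking in the version above: source() evaluates a script in the global environment by default, so a script that reads seed would not see the function argument unless local = TRUE is passed. Below is a minimal sketch of that behaviour, assuming a hypothetical one-line script that stands in for import.r and the others. It uses plain vapply() to stay self-contained; the same function body should work inside future_lapply().

```r
# Hypothetical stand-in for one of the pipeline scripts: it reads `seed`
script <- tempfile(fileext = ".R")
writeLines("result <- seed * 2", script)

run_one <- function(seed) {
  set.seed(seed)
  # local = TRUE evaluates the script in this function's environment,
  # so the script can see `seed` and its `result` stays local to this call
  source(script, local = TRUE)
  result
}

out <- vapply(c(1, 2, 3), run_one, numeric(1))
print(out)
```

If the scripts never read seed and only rely on the RNG state set by set.seed(), the default source() call is sufficient.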
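Unless the particular seeds you picked are somehow essential, you probably want to use statistically sound parallel RNG, which you get automatically if you do: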
library(future.apply)
plan(multisession)
set.seed(42) ## Optional to fix the initial seed
n <- 20L ## Number of runs
empty <- future_lapply(1:n, function(ii) {
  print(.Random.seed)
  print("begin import")
  source(file = "import.r")
  print("preprocessing")
  source(file = "preProc.r")
  print("normal")
  source(file = "algorithms and datasets.r")
  print("resampled")
  source(file = "algorithms and datasets up down.r")
}, future.seed = TRUE)
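To illustrate what the parallel RNG gives you, here is a small sketch, assuming the future.apply package is installed. future_lapply() takes this option under the name future.seed. The draw() helper and the choice of two workers are placeholders for illustration, not part of the original scripts. With the initial seed fixed, two separate parallel runs should produce the same random numbers.

```r
library(future.apply)

# Cap the number of background R sessions (2 is an arbitrary choice here)
plan(multisession, workers = 2)

# Hypothetical stand-in for a full pipeline run: draw one random number
draw <- function(ii) rnorm(1)

set.seed(42)
a <- future_lapply(1:4, draw, future.seed = TRUE)

set.seed(42)
b <- future_lapply(1:4, draw, future.seed = TRUE)

# Compare the two runs element by element
same <- identical(a, b)
print(same)

plan(sequential)  # shut down the background workers
```

The workers argument to plan() is also how the number of cores in use can be controlled.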
Since we are not using ii here, the latter can equally well be done with the future version of base::replicate():
library(future.apply)
plan(multisession)
set.seed(42) ## Optional to fix the initial seed
n <- 20L ## Number of runs
empty <- future_replicate(n, {
  print(.Random.seed)
  print("begin import")
  source(file = "import.r")
  print("preprocessing")
  source(file = "preProc.r")
  print("normal")
  source(file = "algorithms and datasets.r")
  print("resampled")
  source(file = "algorithms and datasets up down.r")
})
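PS. It is not clear to me how you distinguish the results of the different runs. Perhaps you rely on seed to save to different files within those scripts.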
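One possible way to keep the runs apart is to write each result to a file named after its seed and to return the path from each run. This is only a sketch: the run_one() helper, the file naming scheme, and the placeholder computation are assumptions for illustration, not what the original scripts do. It uses plain vapply(), but the same function could be passed to future_lapply().

```r
seeds <- c(2343242, 324256, 764865)
out_dir <- tempdir()  # assumption: replace with a real output directory

run_one <- function(seed) {
  set.seed(seed)
  result <- mean(rnorm(10))  # placeholder for the real pipeline output
  # One file per seed, so parallel runs never write to the same path
  path <- file.path(out_dir, sprintf("result_seed_%d.rds", seed))
  saveRDS(list(seed = seed, result = result), path)
  path
}

paths <- vapply(seeds, run_one, character(1))

# Collect everything afterwards, one list element per seed
results <- lapply(paths, readRDS)
```

Returning values directly from the function is an alternative, since future_lapply() collects return values into a list in the same order as its input.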