How to set .libPaths (checkpoint) on workers when running parallel computation in R
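I use the checkpoint package for reproducible data analysis. Some computations take a long time, so I want to run them in parallel. When run in parallel, however, the checkpoint is not set on the workers, so I get the error "there is no package called xy" (because the package is not installed in my default library directory).
How can I make sure that every worker uses the package versions in the checkpoint folder? I tried setting .libPaths inside the foreach code, but that does not seem to work. I would also prefer to set the checkpoint/libPaths once globally rather than in every foreach call.
Another option might be to change the .Rprofile file, but I do not want to do that.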
checkpoint::checkpoint("2018-06-01")
library(foreach)
library(doFuture)
library(future)
doFuture::registerDoFuture()
future::plan("multisession")
l <- .libPaths()
# Code to run in parallel does not make much sense of course but I wanted to keep it simple.
res <- foreach::foreach(
  x = unique(iris$Species),
  lib.path = l
) %dopar% {
  .libPaths(lib.path)
  stringr::str_c(x, "_")
}
Error in { : task 2 failed - "there is no package called 'stringr'"
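One likely contributor to this failure, offered as a reading of the code and not as a confirmed diagnosis: foreach treats every named argument such as `lib.path = l` as an iteration variable, so each task receives a single element of `l` and not the whole vector. The sketch below illustrates this with placeholder paths (the three strings are assumptions standing in for `.libPaths()`), using `%do%` so that no parallel backend is required.

```r
library(foreach)

# Hypothetical stand-in for .libPaths(); three entries, as in the question
l <- c("/path/to/checkpoint/lib", "/path/to/checkpoint/R", "/path/to/R/library")

# %do% runs sequentially; x and lib.path are iterated in lockstep
seen <- foreach(x = c("setosa", "versicolor", "virginica"), lib.path = l) %do% {
  lib.path  # return what this task received as lib.path
}

# Each task got exactly one path, not the full vector
str(seen)
```

Under this reading, the second task would call `.libPaths()` with only the second entry, which may not contain stringr.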
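Author of the future package here.
UPDATE 2022-05-25: As of future 1.20.0 (2021-11-03), multisession parallel workers automatically inherit the R library paths (= .libPaths()) from the main R session, so the workaround below is no longer needed for them. Other future backends may still need it.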
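To decide in a script whether the workaround is still needed for multisession workers, the installed version can be compared against 1.20.0. The following is a minimal sketch; `needs_libpaths_workaround` is a hypothetical helper name, not part of the future package.

```r
# Hypothetical helper: TRUE if the manual .libPaths() workaround is still
# needed for multisession workers, given a future version
needs_libpaths_workaround <- function(future_version) {
  as.package_version(future_version) < "1.20.0"
}

needs_libpaths_workaround("1.9.0")   # the version used in this answer
needs_libpaths_workaround("1.20.0")

# In a real session one would pass utils::packageVersion("future")
```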
It is enough to pass the library paths of the master R process as a global variable libs and set them on each worker with .libPaths(libs):
## Use CRAN checkpoint from 2018-07-24 to get future (>= 1.9.0) [1],
## otherwise the below stdout won't be relayed back to the master
## R process, but setting .libPaths() does also work in older
## versions of the future package.
## [1] https://cran.microsoft.com/snapshot/2018-07-24/web/packages/future
checkpoint::checkpoint("2018-07-24")
stopifnot(packageVersion("future") >= "1.9.0")
libs <- .libPaths()
print(libs)
### [1] "/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1"
### [2] "/home/hb/.checkpoint/R-3.5.1"
### [3] "/usr/lib/R/library"
library(foreach)
doFuture::registerDoFuture()
future::plan("multisession")
res <- foreach::foreach(x = unique(iris$Species)) %dopar% {
  ## Use the same library paths as the master R session
  .libPaths(libs)
  cat(sprintf("Library paths used by worker (PID %d):\n", Sys.getpid()))
  cat(sprintf(" - %s\n", sQuote(.libPaths())))
  stringr::str_c(x, "_")
}
### - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
### - ‘/home/hb/.checkpoint/R-3.5.1’
### - ‘/usr/lib/R/library’
### Library paths used by worker (PID 9394):
### - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
### - ‘/home/hb/.checkpoint/R-3.5.1’
### - ‘/usr/lib/R/library’
### Library paths used by worker (PID 9412):
### - ‘/home/hb/.checkpoint/2018-07-24/lib/x86_64-pc-linux-gnu/3.5.1’
### - ‘/home/hb/.checkpoint/R-3.5.1’
### - ‘/usr/lib/R/library’
str(res)
### List of 3
### $ : chr "setosa_"
### $ : chr "versicolor_"
### $ : chr "virginica_"
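The question also asks for a way to set the library paths once and not in every foreach call. One possible approach, sketched here with base R's parallel package and not with the multisession backend used above, is to create the workers explicitly and configure them a single time. The two-worker local PSOCK cluster is an assumption made for illustration.

```r
library(parallel)

# Library paths of the main R session (with checkpoint active, these would
# include the checkpoint folder)
libs <- .libPaths()

# Assumption: a local PSOCK cluster with two workers
cl <- makeCluster(2L)

# Set the library paths once on every worker; the setting stays in effect
# for all later tasks sent to this cluster
invisible(clusterCall(cl, function(paths) .libPaths(paths), libs))

# Ask each worker which library paths it now uses
worker_libs <- clusterEvalQ(cl, .libPaths())
print(worker_libs)

stopCluster(cl)
```

Before it is stopped, such a cluster could presumably be handed to future via future::plan(future::cluster, workers = cl), so that later %dopar% calls reuse the preconfigured workers; this part is untested here.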
FYI, it is on the future roadmap to make it easier to pass the library path(s) down to workers.
My details:
> sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] foreach_1.4.4
loaded via a namespace (and not attached):
[1] drat_0.1.4 compiler_3.5.1 BiocManager_1.30.2 parallel_3.5.1 tools_3.5.1 listenv_0.7.0 doFuture_0.6.0
[8] codetools_0.2-15 iterators_1.0.10 digest_0.6.15 globals_0.12.1 checkpoint_0.4.5 future_1.9.0