R: caret does not use the master node of the PSOCKcluster when using parallel backend
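I am trying to get caret to use a parallel backend to train an xgboost model over a hyperparameter grid.
Here is some code that uses the "Give Me Some Credit" data to demonstrate setting up a parallel backend for caret's hyperparameter grid search.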
library(plyr)
library(dplyr)
library(pROC)
library(caret)
library(xgboost)
library(readr)
library(parallel)
library(doParallel)
if(exists("xgboost_cluster")) stopCluster(xgboost_cluster)
hosts = paste0("192.168.18.", 52:53)
xgboost_cluster = makePSOCKcluster(hosts, master="192.168.18.51")
# load the packages across the cluster
clusterEvalQ(xgboost_cluster, {
  deps = c("plyr", "Rcpp", "dplyr", "caret", "xgboost", "pROC", "foreach", "doParallel")
  for(d in deps) library(d, character.only = TRUE)
  rm(d, deps)
})
registerDoParallel(xgboost_cluster)
# load in the training data
df_train = read_csv("04-GiveMeSomeCredit/Data/cs-training.csv") %>%
  na.omit() %>% # listwise deletion
  select(-`[EMPTY]`) %>%
  mutate(SeriousDlqin2yrs = factor(SeriousDlqin2yrs, # factor variable for classification
                                   labels = c("Failure", "Success")))
# set up the cross-validated hyper-parameter search
xgb_grid_1 = expand.grid(
  nrounds = 1000,
  eta = c(0.01, 0.001, 0.0001),
  max_depth = c(2, 4, 6, 8, 10),
  gamma = 1
)
# pack the training control parameters
xgb_trcontrol_1 = trainControl(
  method = "cv",
  number = 5,
  verboseIter = TRUE,
  returnData = FALSE,
  returnResamp = "all", # save losses across all models
  classProbs = TRUE, # set to TRUE for AUC to be computed
  summaryFunction = twoClassSummary,
  allowParallel = TRUE
)
# train the model for each parameter combination in the grid,
# using CV to evaluate
xgb_train_1 = train(
  x = as.matrix(df_train %>%
                  select(-SeriousDlqin2yrs)),
  y = as.factor(df_train$SeriousDlqin2yrs),
  trControl = xgb_trcontrol_1,
  tuneGrid = xgb_grid_1,
  method = "xgbTree"
)
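I checked that all of the cores on the hosts are being used for training, but no processes are running on the master node. Is this the expected behavior? Is there any way to change it and use the cores on the master node for processing as well?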
To use the master node for processing, you just need to add 'localhost' to hosts, like this:
hosts = c("localhost", paste0("192.168.18.", 52:53))
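This adds one core on the master node to the cluster, and that core is then used for processing. If you want to add more cores, just pass in more instances of 'localhost'.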
hosts = c(rep('localhost', detectCores()), paste0("192.168.18.", 52:53))
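One way to confirm that workers really were started on the master is to ask each worker for its machine name. The sketch below is an illustration only, not part of the original answer: it uses a purely local cluster in place of the mixed host list, and the helper local_worker_count() is a hypothetical name introduced here. It also guards against detectCores() returning NA and leaves one core free for the master R process, which may or may not be what you want.

```r
library(parallel)

# Hypothetical helper (not from the original answer): how many local workers
# to start, leaving `reserve` cores free for the master R process.
local_worker_count <- function(reserve = 1L) {
  n <- detectCores()
  if (is.na(n)) n <- 1L          # detectCores() can return NA
  max(1L, n - reserve)
}

# Assumption: cap at 2 workers so the demo stays small. In the real setup the
# remote addresses would be appended, e.g. paste0("192.168.18.", 52:53).
n_local <- min(2L, local_worker_count())
hosts <- rep("localhost", n_local)

cl <- makePSOCKcluster(hosts)

# Ask every worker which machine it is running on.
worker_nodes <- unlist(clusterCall(cl, function() Sys.info()[["nodename"]]))
print(table(worker_nodes))

stopCluster(cl)
```

With the full host list, the table should show the master's name once per 'localhost' entry alongside the remote machines.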