
Parallelism in XGBoost machine learning technique


I am trying to optimize XGBoost execution by giving it the parameter nthread = 16, where my system has 24 cores. But when I train my model, CPU utilization never seems to exceed roughly 20% at any point during training. The code snippet is as follows:

param_30 <- list("objective" = "reg:linear",    # linear regression
                 "subsample" = subsample_30,
                 "colsample_bytree" = colsample_bytree_30,
                 "max_depth" = max_depth_30,    # maximum depth of tree
                 "min_child_weight" = min_child_weight_30,
                 "max_delta_step" = max_delta_step_30,
                 "eta" = eta_30,    # step size shrinkage
                 "gamma" = gamma_30,    # minimum loss reduction
                 "nthread" = nthreads_30,    # number of threads to be used
                 "scale_pos_weight" = 1.0
)
model <- xgboost(data = training.matrix[,-5],
                 label = training.matrix[,5],
                 verbose = 1, nrounds = nrounds_30, params = param_30,
                 maximize = FALSE, early_stopping_rounds = searchGrid$early_stopping_rounds_30[x])

Please explain (if possible) how I can increase CPU utilization and speed up model training for more efficient execution. Code in R would be helpful for further understanding.

Assumption: this is about execution in the R package of XGBoost.

This is a guess... but I have had this happen to me...

You are spending too much time communicating between threads during the parallelism and never becoming CPU-bound: https://en.wikipedia.org/wiki/CPU-bound

Bottom line: your data isn't large enough (rows and columns), and/or your trees aren't deep enough (max_depth), to warrant that many cores. There is too much overhead. xgboost parallelizes split evaluations, so deep trees on big data can keep the CPU humming at max.

I have trained many models where single-threaded outperforms 8/16 cores. Too much time spent switching and not enough work.

**MORE DATA, DEEPER TREES, OR FEWER CORES :)**
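To see this effect yourself, a minimal benchmark sketch (not from the original post; the synthetic data, sizes, and objective `reg:squarederror`, which replaces the deprecated `reg:linear`, are my assumptions) that times training at several thread counts. On small, shallow problems the timings will flatten or even get worse as `nthread` grows:

```r
# Hypothetical benchmark: time xgb.train at several thread counts to see
# where thread overhead outweighs the parallel split-evaluation speedup.
library(xgboost)

set.seed(42)
n <- 100000; p <- 50                        # synthetic data; scaling depends on size
X <- matrix(rnorm(n * p), nrow = n)
y <- rnorm(n)
dtrain <- xgb.DMatrix(data = X, label = y)

for (nt in c(1, 2, 4, 8, 16)) {
  t <- system.time(
    xgb.train(params = list(objective = "reg:squarederror",
                            max_depth = 8,  # deeper trees -> more work per split
                            nthread = nt),
              data = dtrain, nrounds = 20, verbose = 0)
  )
  cat(sprintf("nthread = %2d: %.2f s elapsed\n", nt, t["elapsed"]))
}
```

If elapsed time stops improving (or worsens) beyond a few threads, reduce `nthread`; if it keeps improving, your data and tree depth are large enough to feed more cores.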

I tried to answer this question but my post was deleted by a moderator. Please see https://stackoverflow.com/a/67188355/5452057 which I believe could also help you; it relates to missing MPI support in the xgboost R package for Windows available from CRAN.
