
caret: Choosing the correct number of cores in parallel backend

I am trying to use caret to cross-validate an elastic net model, using the glmnet implementation, on an Ubuntu machine with 8 CPU cores and 32 GB of RAM. When I train sequentially, CPU usage is maxed out on one core, but only about 50% of the memory is used on average.
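For context, here is a minimal sketch of the kind of sequential run being described; the data, seed, and tuning settings below are placeholders, not the actual setup:

    library(caret)    # model training / cross-validation
    library(glmnet)   # elastic net backend used by method = "glmnet"

    # Placeholder data standing in for the real training set
    set.seed(42)
    x <- matrix(rnorm(1000 * 50), ncol = 50)
    y <- rnorm(1000)

    # 10-fold cross-validated elastic net, run sequentially
    # (no parallel backend registered)
    ctrl <- trainControl(method = "cv", number = 10)
    fit  <- train(x, y, method = "glmnet", trControl = ctrl, tuneLength = 5)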

  • When I use registerDoMC(cores = xxx), do I need to worry about only registering xxx = floor(100/y) cores, where y is the memory usage of the model when using a single core (in %), in order to not run out of memory? (See the sketch after this list.)

  • Does caret have any heuristics that allow it to figure out the max. number of cores to use?

  • Is there any set of heuristics that I can use to dynamically adjust the number of cores to use my computing resources optimally across different sizes of data and model complexities?
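A minimal sketch of the floor(100/y) heuristic from the first bullet; the 50% figure is taken from the question, and everything else here is an assumption rather than anything caret does for you:

    library(doMC)

    y       <- 50                    # % of RAM used by one sequential run
    mem_cap <- floor(100 / y)        # heuristic from the first bullet
    cores   <- min(mem_cap, parallel::detectCores())

    registerDoMC(cores = cores)      # caret uses this foreach backend automatically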


Edit:

FWIW, attempting to use 8 cores made my machine unresponsive. Clearly caret does not check whether spawning xxx processes is likely to be problematic. How can I then choose the number of cores dynamically?

Clearly caret does not check whether spawning xxx processes is likely to be problematic.

True; it cannot predict future performance of your computer.

You should get an understanding of how much memory the model uses when running sequentially. You can start the training, use top or similar tools to estimate the amount of RAM used, and then kill the process. If you use X GB of RAM sequentially, running on M cores will require roughly X(M+1) GB of RAM: one copy for each of the M workers plus one for the parent session.
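In other words, solve X(M+1) <= total RAM for M. Plugging in the question's own numbers (roughly 50% of 32 GB, i.e. about 16 GB per sequential run):

    X     <- 16   # GB of RAM used by one sequential run (50% of 32 GB)
    total <- 32   # GB of RAM on the machine

    M <- max(1, floor(total / X) - 1)   # largest M with X * (M + 1) <= total
    M   # 1 worker; registering 8 cores would ask for 16 * 9 = 144 GB,
        # which is why the machine became unresponsive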
