
Setting up an H2O cluster while running several R programs (close to 20) that need access to the cluster

I have to run the same R script in parallel (via batches) with different parameters. The R script builds and scores an H2O model. In this case, should I:

  1. Set up an individual cluster for each batch run of the R script?

(OR)

  2. Create a common cluster and set the scripts to use it?

I would prefer the latter solution, but I am not sure how to automate initializing and shutting down the H2O cluster for so many batches. The first batch would have to create the cluster (h2o.init()) and the last batch would have to shut it down.

Setting up an individual h2o cluster per R session is ideal.

When initiating an h2o cluster with h2o::h2o.init(), make sure you specify these differently for each R session (each script running its own R session):

  • ip / port (a port on localhost that is not already taken)
  • name (so you can check its progress/usage in a terminal via top/htop)

Change other options as required. Each R session knows which h2o cluster it is attached to, so h2o::h2o.shutdown() will only shut down that specific cluster.
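As a minimal sketch of what each script might do, assuming the standard h2o R package; the port, cluster name, and resource limits below are placeholders that each batch would supply differently:

```r
library(h2o)

# Hypothetical: each batch passes its own port and cluster name,
# e.g. via commandArgs(); the values here are placeholders.
args <- commandArgs(trailingOnly = TRUE)
port <- as.integer(args[1])      # e.g. 54321, 54323, ... (h2o also uses port + 1 internally)
cluster_name <- args[2]          # e.g. "batch_01"

h2o.init(ip = "localhost",
         port = port,
         name = cluster_name,    # cloud name, handy when inspecting the java processes in top/htop
         nthreads = 2,           # limit threads so ~20 clusters don't oversubscribe the machine
         max_mem_size = "2G")    # per-cluster memory cap

# ... build and score the H2O model here ...

# Shuts down only the cluster this R session is attached to.
h2o.shutdown(prompt = FALSE)
```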

Setting up a single cluster and having all scripts use it is the recommended approach, because it is more efficient. There is memory overhead for each cluster, so your 20 separate clusters would be wasteful (even more so if there are any static data tables all your scripts need to use). You would also have to guess the correct amount of memory to give each one.

On the other hand, if your 20 scripts each refer to a specific table by name, e.g. loading it with their own data, and generally assume they are the only script running, you will have a problem: you either need to modify the scripts to be well-behaved or run each on its own ip/port.
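If they do share one cluster, one way to keep the scripts well-behaved is to give every frame a batch-specific key. A minimal sketch, assuming the h2o R package's destination_frame argument; batch_id and the file path are placeholders:

```r
library(h2o)

# Connect to the shared cluster (assumed to be already running on this port).
h2o.init(ip = "localhost", port = 54321, startH2O = FALSE)

# Hypothetical batch identifier, e.g. passed in via commandArgs().
batch_id <- commandArgs(trailingOnly = TRUE)[1]

# A batch-specific key prevents scripts from clobbering each other's frames.
train <- h2o.importFile("train.csv",
                        destination_frame = paste0("train_", batch_id))
```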

I am not sure how to automate initializing and shutting down the H2O cluster for so many batches. The first batch has to create the cluster (h2o.init()) and the last batch has to shut it down.

Start H2O from the command line before running the first script, and manually kill it after all the scripts have completed. Done this way, each script will discover that the cluster is already running when it calls h2o.init().
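A minimal sketch of that setup; the java launch line, jar path, memory, port, and cluster name are placeholders, and startH2O = FALSE tells the h2o R package to connect to an existing cluster rather than launch its own:

```r
library(h2o)

# The cluster is assumed to have been started beforehand from a shell, e.g.:
#   java -Xmx8g -jar h2o.jar -port 54321 -name shared_cluster
# (jar path, memory, port, and name above are placeholders).

# Connect only; fail if no cluster is found at this address.
h2o.init(ip = "localhost", port = 54321, startH2O = FALSE)

# ... script body: build and score the model ...

# Do NOT call h2o.shutdown() here; the shared cluster stays up
# until every batch has finished and it is killed manually.
```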

If it has to be fully automatic, make sure the launch command runs first, and you'll need some kind of watcher script to notice when all the other processes have completed. (I tend to run a combination of ps and grep in cron jobs; there are more sophisticated ways, of course.)
