
Does H2O on a single-node cluster do parallel processing, or does parallel processing only kick in on a multi-node cluster?

We are running H2O as a single-node cluster in AWS:

R is connected to the H2O cluster: 
    H2O cluster uptime:         5 seconds 217 milliseconds 
    H2O cluster timezone:       Etc/UTC 
    H2O data parsing timezone:  UTC 
    H2O cluster version:        3.17.0.4153 
    H2O cluster version age:    10 months and 4 days !!! 
    H2O cluster name:           h2o-8ba55ebb-7d49-41bd-b4e2-d7be45b5f53e 
    H2O cluster total nodes:    1 
    H2O cluster total memory:   22.20 GB 
    H2O cluster total cores:    8 
    H2O cluster allowed cores:  8 
    H2O cluster healthy:        TRUE 
    H2O Connection ip:          localhost 
    H2O Connection port:        54321 
    H2O Connection proxy:       NA 
    H2O Internal Security:      FALSE 
    H2O API Extensions:         XGBoost, Algos, AutoML, Core V3, Core V4 
    R Version:                  R version 3.4.3 (2017-11-30) 

We start H2O from Java with -nthreads -1:

java -ea -Xmx25g -jar /path/to/h2o.jar -name unique-cloud-name \
     -ip localhost -ice_root /tmp/h2o-tmp -nthreads -1

We're wondering whether H2O on a single-node cluster does parallel processing, i.e. whether it uses all available and allowed cores. When we run top -H on the command line we see 8 active java threads, which happens to match the core count, and we're wondering whether those belong to H2O and are helping train our model.


Yes, H2O will use all the cores on a single node to train one model.

nthreads lets you explicitly set the thread pool size that controls the amount of parallelism per process.
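To confirm that the busy threads seen in top -H really belong to the H2O JVM, something like the following sketch may help. It assumes a Linux host (it reads /proc and uses the procps versions of pgrep and top) and that H2O was launched with a command line containing h2o.jar, as in the question; the helper name count_threads is made up for illustration. Note that the total JVM thread count will normally be higher than the core count, because the JVM also runs garbage-collection and housekeeping threads; the threads near the top of the listing while a model is training are the ones of interest.

```shell
#!/bin/sh
# count_threads PID: number of threads of a process, read from /proc (Linux only).
# Hypothetical helper, not part of H2O.
count_threads() {
    ls "/proc/$1/task" | wc -l
}

# Assumption: the H2O JVM was started with "-jar /path/to/h2o.jar",
# so its command line contains "h2o.jar".
h2o_pid=$(pgrep -f 'h2o\.jar' | head -n 1)

if [ -n "$h2o_pid" ]; then
    echo "H2O JVM pid:      $h2o_pid"
    echo "JVM thread count: $(count_threads "$h2o_pid")"
    echo "Cores on host:    $(nproc)"
    # One batch-mode snapshot of per-thread CPU usage, restricted to this JVM
    top -H -b -n 1 -p "$h2o_pid" | head -n 20
else
    echo "No process matching h2o.jar found"
fi
```

Running this while a model is training should show several threads of the same pid using CPU at once if H2O is parallelizing across cores.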

