
Release H2O Grid From Memory?

I am struggling to find the correct API for releasing memory for an object created by the H2O grid. This code was pre-written by someone else and I am currently maintaining it.

#train grid search
gbm_grid1 <- h2o.grid(algorithm = "gbm"                                  #specifies gbm algorithm is used
                      ,grid_id = paste("gbm_grid1",current_date,sep="_")   #defines a grid identification
                      ,x = predictors                                    #defines column variables to use as predictors
                      ,y = y                                             #specifies the response variable
                      ,training_frame = train1                           #specifies the training frame
                      
                      #gbm parameters to remain fixed
                      ,nfolds = 5                     #use 5 folds for cross-validation (acceptable here in order to reduce training time)
                      ,distribution = "bernoulli"     #specify that we are predicting a binary dependent variable
                      ,ntrees = 1000                  #specify the number of trees to build (1000 is essentially the maximum number of trees that can be built; the early stopping parameters defined later make it unlikely our model will reach 1000 trees)
                      ,learn_rate = 0.1               #specify the learn rate used for gradient descent optimization (goal is to use as small a learn rate as possible)
                      ,learn_rate_annealing = 0.995   #specifies that the learn rate is multiplied by 0.995 after every tree (this can help speed up training for our grid search)
                      ,max_depth = tuned_max_depth
                      ,min_rows = tuned_min_rows
                      ,sample_rate = 0.8              #specify the fraction of rows sampled for each tree
                      ,col_sample_rate = 0.8          #specify the fraction of columns sampled when making a split decision
                      ,stopping_metric = "logloss"    #specify the metric used for early stopping
                      ,stopping_tolerance = 0.001     #specify minimum change required in stopping metric for individual model to continue training
                      ,stopping_rounds = 5            #stop training once the stopping metric has failed to improve by the stopping tolerance for this many scoring rounds
                      
                      #specifies hyperparameters to fluctuate during model building in the grid search
                      ,hyper_params = gbm_hp2
                      
                      #specifies the search criteria, which include stopping metrics to speed up model building
                      ,search_criteria = search_criteria2
                      
                      #sets a reproducible seed
                      ,seed = 123456                         
)

h2o.rm(gbm_grid1)

The problem is that I believe this code was written a while ago and has been deprecated since. h2o.rm(gbm_grid1) fails, and RStudio tells me that I need a hex identifier. So I assigned my object an identifier and tried h2o.rm(gbm_grid1, "identifier.hex"), and it tells me I cannot release this type of object.

The issue is that I run out of memory if I move on to the next steps of the script. What should I do?

This is what I get with h2o.ls():

[Screenshot of the h2o.ls() output; image not preserved]

Yes, you can remove objects with h2o.rm(). You can use the variable name or key.

h2o.rm(your_object)
h2o.rm("your_key")

You can use h2o.ls() to check which objects are in memory. Also, you can pass the argument cascade = TRUE to h2o.rm() to remove sub-models as well.
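
Since the question reports that passing the grid object itself to h2o.rm() fails, one possible workaround is to remove the grid and its models by key instead. The following is only a sketch, assuming gbm_grid1 is the H2OGrid returned by h2o.grid() above and that the R session is still connected to the H2O cluster. It relies on the grid_id and model_ids slots of the grid object; the variable names grid_key and model_keys are made up for illustration, and I have not checked this against every H2O version.

```r
library(h2o)

# Assumption: an H2O cluster is already running and `gbm_grid1` is the
# H2OGrid object created by h2o.grid() earlier in the script.
grid_key   <- gbm_grid1@grid_id             # key of the grid itself
model_keys <- unlist(gbm_grid1@model_ids)   # keys of the models the grid trained

# See what the cluster holds before cleaning up.
print(h2o.ls())

# Remove the models by key (cascade = TRUE also removes dependent objects
# such as sub-models), then remove the grid key.
h2o.rm(model_keys, cascade = TRUE)
h2o.rm(grid_key)

# Drop the R-side handle as well and let R garbage-collect.
rm(gbm_grid1)
gc()

# Check that the keys are gone.
print(h2o.ls())
```

If nothing else on the cluster needs to be kept, h2o.removeAll() is a blunter option that clears every object, including the training frames, so those would have to be reloaded afterwards.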

See more here
