
Tidymodels Tuning Recipe Parameters

Using tidymodels, I really like being able to tune not only model parameters but also some recipe steps, for example the number of components in step_pls(). The problem is that I'm having trouble limiting the range of possible values. For example, if I use step_umap(), I would like to limit the search space to 2 to 5 components. When I replace step_pls() with step_umap(), the following code crashes the session, because it tries to build a UMAP embedding with around 50 components. So my question is: when using grid_random() or grid_max_entropy(), how can I limit the search range for a specific tuning parameter?

Note: I also tried something like param_grid %>% grid_random(size = 5, num_comp() %>% range_set(c(3, 5))), but it seems to be ignored.

Thanks

# Load Packages -----------------------------------------------------------
require(tidyverse)
require(lubridate)
require(tidymodels)
require(rsample)
require(themis)
require(recipes)
require(embed)

# Load Data ---------------------------------------------------------------
data <- read_csv("....data.csv")

# Modelling - Data Partition ----------------------------------------------
split_prop <- 0.80
init_split <- initial_time_split(data, prop = split_prop)
set_train <- training(init_split)
set_test <- testing(init_split)

# Modelling - Resamples ---------------------------------------------------
valid_folds <- rsample::vfold_cv(set_train, v = 5)

# Modelling - Data Transf -------------------------------------------------
recip_train <- recipe(label ~ ., data = set_train) %>%
  step_normalize(all_predictors()) %>%
  step_pls(all_predictors(), outcome = "label", num_comp = tune())

# Modelling - Model Specs ---------------------------------------------------
model_glm <- linear_reg() %>%
  set_args(penalty = tune(), mixture = tune()) %>%
  set_mode("regression") %>%
  set_engine("glmnet")

# Workflow ------------------------------------------------------------------
wflw <- workflow() %>%
  add_recipe(recip_train) %>%
  add_model(model_glm)

# Modelling - Tuning Control -------------------------------------------------
ctr_tune <- control_grid(
  verbose = TRUE,
  allow_par = TRUE,
  extract = NULL,
  save_pred = TRUE,
  pkgs = NULL
)

param_grid <- wflw %>%
  parameters() %>%
  finalize(set_train) %>%
  grid_max_entropy(size = 5)

# Modelling - Tuning ---------------------------------------------------------
tuning <- tune_grid(
  object = wflw,
  resamples = valid_folds,
  grid = param_grid,
  control = ctr_tune,
  metrics = metric_set(rmse)
)

If you have a specific range for num_comp that you want to try out, I wouldn't bother with going to the workflow and getting the parameters, etc. I would set up the tuning grid with the parameters directly:

library(dials)
#> Loading required package: scales
grid_max_entropy(penalty(),
                 mixture(),
                 num_comp(range = c(2, 5)),
                 size = 5)
#> # A tibble: 5 x 3
#>         penalty mixture num_comp
#>           <dbl>   <dbl>    <int>
#> 1 0.00161        0.721         5
#> 2 0.751          0.376         4
#> 3 0.00000000974  0.395         3
#> 4 0.000107       0.0747        4
#> 5 0.0000000451   0.906         3

Created on 2020-07-19 by the reprex package (v0.3.0)
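
If you would rather keep the parameter set approach from the question, a possible alternative is to narrow only num_comp with update() before building the grid. This is a minimal sketch, not a tested replacement for the question's pipeline: the parameter set is built by hand here as a stand-in for wflw %>% parameters(), and it assumes the parameter id is "num_comp", which should be the default when the recipe step is not given a custom id. The ~50 components in the question most likely come from finalize(set_train), which fills the unknown upper bound of num_comp from the number of columns in the data. Once the range is set explicitly, that step should no longer be needed for this parameter.

```r
library(dials)

# Stand-in for `wflw %>% parameters()` from the question (assumption:
# the real parameter set carries the same three ids).
params <- parameters(penalty(), mixture(), num_comp())

# Replace only num_comp; penalty and mixture keep their default ranges.
params_limited <- update(params, num_comp = num_comp(range = c(2L, 5L)))

# Any grid function that accepts a parameter set can be used here.
set.seed(123)
grid <- grid_random(params_limited, size = 5)

# Every sampled num_comp should fall within 2..5.
range(grid$num_comp)
```

The resulting tibble can be passed to tune_grid() through its grid argument, just like param_grid in the question.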
