
Xgboost Hyperparameter Tuning In R for binary classification


I am new to R and am trying to do hyperparameter tuning for xgboost on a binary classification problem, but I am running into an error. I would appreciate it if someone could help me.

Error in as.matrix(cv.res)[, 3] : subscript out of bounds
In addition: Warning message:
'early.stop.round' is deprecated. Use 'early_stopping_rounds' instead.
See help("Deprecated") and help("xgboost-deprecated").

Please find the code snippet below. I would also appreciate it if someone could suggest an alternative to this approach in R.

X_Train <- as(X_train, "dgCMatrix")

GS_LogLoss = data.frame("Rounds" = numeric(),
                        "Depth" = numeric(),
                        "r_sample" = numeric(),
                        "c_sample" = numeric(),
                        "minLogLoss" = numeric(),
                        "best_round" = numeric())

for (rounds in seq(50, 100, 25)) {
  for (depth in c(4, 6, 8, 10)) {
    for (r_sample in c(0.5, 0.75, 1)) {
      for (c_sample in c(0.4, 0.6, 0.8, 1)) {
        for (imb_scale_pos_weight in c(5, 10, 15, 20, 25)) {
          for (wt_gamma in c(5, 7, 10)) {
            for (wt_max_delta_step in c(5, 7, 10)) {
              for (wt_min_child_weight in c(5, 7, 10, 15)) {
                set.seed(1024)
                eta_val = 2 / rounds
                cv.res = xgb.cv(data = X_Train, nfold = 2, label = y_train,
                                nrounds = rounds, eta = eta_val, max_depth = depth,
                                subsample = r_sample, colsample_bytree = c_sample,
                                early.stop.round = 0.5 * rounds,
                                scale_pos_weight = imb_scale_pos_weight,
                                max_delta_step = wt_max_delta_step,
                                gamma = wt_gamma,
                                objective = 'binary:logistic',
                                eval_metric = 'auc',
                                verbose = FALSE)
                print(paste(rounds, depth, r_sample, c_sample,
                            min(as.matrix(cv.res)[, 3])))
                GS_LogLoss[nrow(GS_LogLoss) + 1, ] = c(rounds, depth, r_sample, c_sample,
                                                       min(as.matrix(cv.res)[, 3]),
                                                       which.min(as.matrix(cv.res)[, 3]))
              }
            }
          }
        }
      }
    }
  }
}

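Both messages come from the newer xgboost API: early.stop.round has been renamed to early_stopping_rounds, and xgb.cv() now returns a result object rather than a plain evaluation table, so as.matrix(cv.res)[, 3] no longer points at an evaluation column. Below is a minimal sketch of the same call with the renamed argument, reading the cross-validated AUC back from the evaluation_log that recent xgboost versions attach to the result (the test_auc_mean column name is assumed from eval_metric = 'auc'):

cv.res <- xgb.cv(data = X_Train, label = y_train, nfold = 2,
                 nrounds = rounds, eta = eta_val, max_depth = depth,
                 subsample = r_sample, colsample_bytree = c_sample,
                 early_stopping_rounds = round(0.5 * rounds),  # renamed argument
                 scale_pos_weight = imb_scale_pos_weight,
                 max_delta_step = wt_max_delta_step, gamma = wt_gamma,
                 objective = 'binary:logistic', eval_metric = 'auc',
                 verbose = FALSE)

eval_log   <- cv.res$evaluation_log            # per-iteration CV metrics
best_auc   <- max(eval_log$test_auc_mean)      # AUC is maximised, not minimised
best_round <- which.max(eval_log$test_auc_mean)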

To choose your hyperparameters you can use the meta-package tidymodels, in particular the packages parsnip, rsample, yardstick and tune.

A workflow like this should work:

library(tidyverse)
library(tidymodels)

# Specify the model and the parameters to tune (parsnip)
model <-
  boost_tree(tree_depth = tune(), mtry = tune()) %>% 
  set_mode("classification") %>% 
  set_engine("xgboost")

# Specify the resampling method (rsample)
splits <- vfold_cv(X_train, v = 2)

# Specify the metrics to optimize (yardstick)
metrics <- metric_set(roc_auc)

# Specify the parameter grid (or use dials to automate your grid search)
grid <- expand_grid(tree_depth = c(4, 6, 8, 10),
                    mtry = c(2, 10, 50)) # You can add others

# Run each model (tune)
tuned <- tune_grid(model,
                   Y ~ .,
                   resamples = splits,
                   grid = grid,
                   metrics = metrics,
                   control = control_grid(verbose = TRUE))

# Check results
show_best(tuned)
autoplot(tuned)
select_best(tuned)

# Update model
tuned_model <- 
  model %>% 
  finalize_model(select_best(tuned)) %>% 
  fit(Y ~ ., data = X_train)

# Make prediction 
predict(tuned_model, X_train)
predict(tuned_model, X_test)

Note that the argument names in the model specification can differ from the original xgboost names, because parsnip is a unified interface with consistent names across many models. You can find the correspondences here.
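If you want to check that mapping for your own specification, parsnip's translate() prints the underlying xgboost call (a quick sketch; the exact engine argument names it reports depend on your parsnip version):

# Show how the unified parsnip arguments map onto the xgboost engine,
# e.g. tree_depth -> max_depth and trees -> nrounds.
boost_tree(tree_depth = tune(), mtry = tune()) %>%
  set_mode("classification") %>%
  set_engine("xgboost") %>%
  translate()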
