
How do we optimize XGBoost hyperparameters using optuna without using the booster object?

I am currently working on using XGBoost for prediction. I wish to know which group of hyperparameters would provide the best results. I have used Optuna for this, but the prediction results with the tuned parameters seem to be off.

import optuna
import xgboost as xgb
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def objective(trial, data=X1, target=Y1):
    X1_train, X1_test, y1_train, y1_test = train_test_split(data, target, test_size=0.2, random_state=100)
    param = {
        'tree_method': 'exact',  # Which one to use here: exact or approx?
        'lambda': trial.suggest_loguniform('lambda', 1e-3, 100.0),  # What should be the range?
        'alpha': trial.suggest_loguniform('alpha', 1e-3, 10.0),  # What should be the range?
        'colsample_bytree': trial.suggest_categorical('colsample_bytree', [0.5, 0.6, 0.7, 0.8, 0.9, 1.0]),
        'subsample': trial.suggest_categorical('subsample', [0.4, 0.5, 0.6, 0.7, 0.8, 1.0]),
        'learning_rate': trial.suggest_categorical('learning_rate', [0.008, 0.009, 0.01, 0.012, 0.014, 0.016, 0.018, 0.02]),
        'n_estimators': 1000,
        'max_depth': trial.suggest_categorical('max_depth', [3, 4, 5, 6, 7, 8, 9, 10]),
        'random_state': trial.suggest_categorical('random_state', [25, 50, 100]),  # What should be the range?
        'min_child_weight': trial.suggest_int('min_child_weight', 1, 100),  # What should be the range?
        'objective': 'reg:squarederror'
    }
    eval_set = [(X1_train, y1_train), (X1_test, y1_test)]
    xg_reg1 = xgb.XGBRegressor(**param)
    xg_reg1.fit(X1_train, y1_train, early_stopping_rounds=100, eval_metric=["rmse", "mae"], eval_set=eval_set, verbose=False)
    preds = xg_reg1.predict(X1_test)
    rmse = mean_squared_error(y1_test, preds, squared=False)
    return rmse

The hyperparameters are optimized using Optuna as shown:

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=50)
print('Number of finished trials:', len(study.trials))
print('Best trial:', study.best_trial.params)

The parameters for the best trial are then used in XGBoost for prediction.
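Below is a minimal sketch of that final step, assuming the X1/Y1 data, the imports, and the fitted study object from the snippets above; the fixed parameters ('tree_method', 'n_estimators', 'objective') are copied from the objective function, and the fit call mirrors the one used during tuning.

# Sketch only: refit XGBoost with the best trial's parameters and predict.
# Assumes X1, Y1, the imports, and the `study` object defined above.
best_params = study.best_trial.params
best_params.update({
    'tree_method': 'exact',          # fixed values taken from the objective function
    'n_estimators': 1000,
    'objective': 'reg:squarederror'
})

X1_train, X1_test, y1_train, y1_test = train_test_split(X1, Y1, test_size=0.2, random_state=100)

final_reg = xgb.XGBRegressor(**best_params)
final_reg.fit(X1_train, y1_train,
              eval_set=[(X1_test, y1_test)],
              eval_metric="rmse",
              early_stopping_rounds=100,
              verbose=False)

final_preds = final_reg.predict(X1_test)
print('Final test RMSE:', mean_squared_error(y1_test, final_preds, squared=False))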
