
FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan

I'm trying to tune the learning rate and max_depth parameters of an XGBoost regression model:

from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

param_grid = [
    # trying learning rates from 0.01 to 0.2
    {'eta ':[0.01, 0.05, 0.1, 0.2]},
    # and max depth from 4 to 10
    {'max_depth': [4, 6, 8, 10]}
  ]

xgb_model = XGBRegressor(random_state = 0)
grid_search = GridSearchCV(xgb_model, param_grid, cv=5,
                           scoring='neg_root_mean_squared_error',
                           return_train_score=True)

grid_search.fit(final_OH_X_train_scaled, y_train)

final_OH_X_train_scaled is the training dataset that contains only numerical features.

y_train is the training label - also numerical.

This is returning the error:

FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan.

I've seen other similar questions, but couldn't find an answer yet.

Also tried with:

param_grid = [
    # trying learning rates from 0.01 to 0.2
    # and max depth from 4 to 10
    {'eta ': [0.01, 0.05, 0.1, 0.2], 'max_depth': [4, 6, 8, 10]}   
  ]

But it generates the same error.

EDIT: Here's a sample of the data:

final_OH_X_train_scaled.head()

[image: output of final_OH_X_train_scaled.head()]

y_train.head()

[image: output of y_train.head()]

EDIT2:

The data sample can be recreated with:

import pandas as pd

final_OH_X_train_scaled = pd.DataFrame(
    [[0.540617, 1.204666, 1.670791, -0.445424, -0.890944, -0.491098, 0.094999, 1.522411, -0.247443, -0.559572, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
     [0.117467, -2.351903, 0.718969, -0.119721, -0.874705, -0.530832, -1.385230, 2.126612, -0.947731, -0.156967, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0],
     [0.901138, -0.208256, -0.019134, 0.265250, -0.889128, -0.467753, 0.169306, -0.973256, 0.056164, -0.671978, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0],
     [2.074639, 0.100602, -1.645121, 0.929598, 0.811911, 1.364560, 0.337242, 0.435187, -0.388075, 1.279959, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
     [2.198099, -0.496254, -0.917933, -1.418407, -0.975889, 1.044495, 0.254181, 1.335285, 2.079415, 2.071974, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]],
    columns=['cont0', 'cont1', 'cont2', 'cont3', 'cont4', 'cont5', 'cont6', 'cont7', 'cont8', 'cont9', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40'])

I was able to reproduce the problem: the fit fails because the 'eta ' key in your parameter grid has a trailing space. Instead of this:

{'eta ':[0.01, 0.05, 0.1, 0.2]},...

Change it to this:

{'eta':[0.01, 0.05, 0.1, 0.2]},...

The error message was unfortunately not very helpful.
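
A trailing space like this is easy to miss by eye. As a rough sketch, a small check run before the search can flag such keys. The helper name keys_with_stray_whitespace is hypothetical and not part of scikit-learn or XGBoost; it only compares each key with its stripped form.

```python
def keys_with_stray_whitespace(param_grid):
    """Return grid keys that have leading or trailing whitespace."""
    # GridSearchCV accepts either a single dict or a list of dicts.
    grids = [param_grid] if isinstance(param_grid, dict) else param_grid
    return [key for grid in grids for key in grid if key != key.strip()]


# The grid from the question, including the misspelled 'eta ' key.
param_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    {'max_depth': [4, 6, 8, 10]},
]

print(keys_with_stray_whitespace(param_grid))  # ['eta ']
```

An empty list means no key has stray whitespace; it does not prove that the names are valid for the estimator.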

As another example, suppose you set the grid for a LogisticRegression (here a pipeline step named cls) to something like:

grid_lr = {
'cls__class_weight': [None, 'balanced'],
'cls__C': [0, .001, .01, .1, 1]
}

You'll get a similar error, because C must be a strictly positive float and this grid includes 0. So double-checking the names and the values of the hyperparameters should be enough to resolve this issue.
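
To see the underlying exception instead of only the FitFailedWarning, one option is to pass error_score='raise' to GridSearchCV, which re-raises the first error from a failed fit. The sketch below assumes scikit-learn is installed and uses synthetic data from make_classification purely for illustration. It also uses a bare LogisticRegression rather than a pipeline, so the key is 'C' rather than 'cls__C'.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only so the example is self-contained.
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

# C=0 is not a valid value, so any fit that uses it fails.
grid = {'C': [0, 0.1, 1]}

# error_score='raise' makes the search raise the original exception
# instead of recording nan for the failed fit and emitting a warning.
search = GridSearchCV(LogisticRegression(), grid, cv=3, error_score='raise')

fit_error = None
try:
    search.fit(X, y)
except Exception as exc:
    fit_error = exc
    # The exception type and message depend on the scikit-learn version.
    print(type(exc).__name__, exc)
```

Once the offending value or name is fixed, error_score can be left at its default so that one failing candidate does not abort the whole search.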
