I'm trying to optimize the learning rate and max_depth parameters of an XGBoost regression model:
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor
param_grid = [
    # trying learning rates from 0.01 to 0.2
    {'eta ':[0.01, 0.05, 0.1, 0.2]},
    # and max depth from 4 to 10
    {'max_depth': [4, 6, 8, 10]}
]
xgb_model = XGBRegressor(random_state = 0)
grid_search = GridSearchCV(xgb_model, param_grid, cv=5,
                           scoring='neg_root_mean_squared_error',
                           return_train_score=True)
grid_search.fit(final_OH_X_train_scaled, y_train)
final_OH_X_train_scaled is the training dataset, which contains only numerical features.
y_train is the training label, which is also numerical.
This is returning the error:
FitFailedWarning: Estimator fit failed. The score on this train-test partition for these parameters will be set to nan.
I've seen other similar questions, but couldn't find an answer yet.
I also tried a single combined grid:
param_grid = [
    # trying learning rates from 0.01 to 0.2
    # and max depth from 4 to 10
    {'eta ': [0.01, 0.05, 0.1, 0.2], 'max_depth': [4, 6, 8, 10]}
]
But it generates the same error.
EDIT: Here's a sample of the data:
final_OH_X_train_scaled.head()
y_train.head()
EDIT2:
The data sample may be retrieved with:
import pandas as pd
final_OH_X_train_scaled = pd.DataFrame([[0.540617 ,1.204666 ,1.670791 ,-0.445424 ,-0.890944 ,-0.491098 ,0.094999 ,1.522411 ,-0.247443 ,-0.559572 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0],
[0.117467 ,-2.351903 ,0.718969 ,-0.119721 ,-0.874705 ,-0.530832 ,-1.385230 ,2.126612 ,-0.947731 ,-0.156967 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0 ,0.0 ,0.0],
[0.901138 ,-0.208256 ,-0.019134 ,0.265250 ,-0.889128 ,-0.467753 ,0.169306 ,-0.973256 ,0.056164 ,-0.671978 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0],
[2.074639 ,0.100602 ,-1.645121 ,0.929598 ,0.811911 ,1.364560 ,0.337242 ,0.435187 ,-0.388075 ,1.279959 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0],
[2.198099 ,-0.496254 ,-0.917933 ,-1.418407 ,-0.975889 ,1.044495 ,0.254181 ,1.335285 ,2.079415 ,2.071974 , 0.0 ,0.0 ,0.0 ,0.0 ,0.0 ,1.0 ,0.0 ,0.0 ,0.0 ,0.0]],
columns=['cont0' ,'cont1' ,'cont2' ,'cont3' ,'cont4' ,'cont5' ,'cont6' ,'cont7' ,'cont8' ,'cont9' ,'31' ,'32' ,'33' ,'34' ,'35' ,'36' ,'37' ,'38' ,'39' ,'40'])
I was able to reproduce the problem: the code fails to fit because there is an extra space in your eta parameter name. Instead of this:
{'eta ':[0.01, 0.05, 0.1, 0.2]},...
Change it to this:
{'eta':[0.01, 0.05, 0.1, 0.2]},...
The error message was unfortunately not very helpful.
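Since a stray space in a key is hard to spot by eye, a small check can be run on the grid before fitting. The following is only a minimal sketch: the helper name keys_with_stray_whitespace is hypothetical (it is not part of scikit-learn or XGBoost), and it catches only leading or trailing whitespace, not other misspellings.

```python
def keys_with_stray_whitespace(param_grid):
    """Return the grid keys that have leading or trailing whitespace."""
    # GridSearchCV accepts either a single dict or a list of dicts.
    grids = [param_grid] if isinstance(param_grid, dict) else list(param_grid)
    return [key for grid in grids for key in grid if key != key.strip()]


# The grid from the question, including the key with the trailing space.
param_grid = [
    {'eta ': [0.01, 0.05, 0.1, 0.2]},
    {'max_depth': [4, 6, 8, 10]},
]

bad_keys = keys_with_stray_whitespace(param_grid)
print(bad_keys)  # ['eta ']
```

An empty list means no key has surrounding whitespace; it does not guarantee that every key is a valid parameter name.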
As another example, if you set the grid for a LogisticRegression to something like
grid_lr = {
'cls__class_weight': [None, 'balanced'],
'cls__C': [0, .001, .01, .1, 1]
}
you'll get a similar error, because C accepts only positive float values and 0 is not positive. So double-checking the names and the values of the hyperparameters should be enough to resolve this issue.
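When the warning alone does not say what went wrong, one option is to pass error_score='raise' to GridSearchCV, which re-raises the underlying exception from the failing fit instead of recording nan. The sketch below makes several assumptions for illustration: it uses a synthetic dataset from make_classification, a bare LogisticRegression rather than the pipeline implied by the cls__ prefix, and a deliberately invalid C of -1.0.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only to make the example self-contained.
X, y = make_classification(n_samples=60, n_features=5, random_state=0)

# -1.0 is deliberately invalid: C must be a positive float.
param_grid = {"C": [-1.0, 1.0]}

search = GridSearchCV(
    LogisticRegression(),
    param_grid,
    cv=3,
    error_score="raise",  # re-raise the fit error instead of scoring nan
)

caught = None
try:
    search.fit(X, y)
except ValueError as exc:
    # The message names the offending parameter and value.
    caught = exc
    print(f"{type(exc).__name__}: {exc}")
```

Once the cause is found and fixed, error_score can go back to its default so that a single failing combination does not abort the whole search.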