
How does best estimator fitting work in RandomizedSearchCV?

I used RandomizedSearchCV (RSCV) with the default 5-fold CV to tune an LGBMClassifier, passing an evaluation set for early stopping.

from sklearn.model_selection import train_test_split, RandomizedSearchCV
from lightgbm import LGBMClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model_LGBM = LGBMClassifier(objective='binary', metric='auc', random_state=0,
                            early_stopping_round=100)

# Parameter space sampled by the random search
distributions = dict(max_depth=range(1, 10),
                     num_leaves=[50, 100, 150],
                     learning_rate=[0.1, 0.2, 0.3],
                     )

clf = RandomizedSearchCV(model_LGBM, distributions, random_state=0, n_iter=100, verbose=10)
clf.fit(X_train, y_train, eval_set=(X_test, y_test))

So the output of the RSCV looks like:

First iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
Second iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
...
Last iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
+1 fit with "valid0's"

I suppose the last fit is the refitted best estimator. Does it use the whole training set? Where does it use the evaluation set?

According to the documentation (provided here), if the refit parameter is True (the default), the model is finally retrained with the best parameters found, on the entire dataset that was passed in (in this case, the training data).
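As a minimal sketch (assuming scikit-learn's usual behaviour of forwarding extra fit keyword arguments such as eval_set to the final refit, and reusing the variables from your question), the refitted best estimator is roughly equivalent to:

from lightgbm import LGBMClassifier

# Best parameter combination found by the random search
best_params = clf.best_params_

final_model = LGBMClassifier(objective='binary', metric='auc', random_state=0,
                             early_stopping_round=100, **best_params)

# The refit uses the whole training set passed to clf.fit(); the same
# eval_set is forwarded, so early stopping still monitors (X_test, y_test).
# This corresponds to the extra "+1 fit" you see in the log.
final_model.fit(X_train, y_train, eval_set=[(X_test, y_test)])

# clf.best_estimator_ holds the estimator produced by this final fit,
# and clf.predict(...) delegates to it.

So the evaluation set contributes no training rows at any point; it is only used for early stopping and metric reporting, both in every CV fit and in the final refit.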
