
How does best estimator fitting work in RandomizedSearchCV?

I used RandomizedSearchCV (RSCV) with the default 5-fold CV to tune an LGBMClassifier, passing an evaluation set for early stopping.

from sklearn.model_selection import train_test_split, RandomizedSearchCV
from lightgbm import LGBMClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model_LGBM = LGBMClassifier(objective='binary', metric='auc', random_state=0,
                            early_stopping_round=100)

# Parameter space sampled by the random search
distributions = dict(max_depth=range(1, 10),
                     num_leaves=[50, 100, 150],
                     learning_rate=[0.1, 0.2, 0.3],
                     )

clf = RandomizedSearchCV(model_LGBM, distributions, random_state=0, n_iter=100, verbose=10)
clf.fit(X_train, y_train, eval_set=(X_test, y_test))

So the output of the RSCV looks like:

First iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
Second iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
...
Last iter: CV 1/5 "valid0's", CV 2/5 "valid0's", ..., CV 5/5 "valid0's";
+1 fit with "valid0's"

I suppose the last fit is the refitted best estimator. Does it use the whole training set? Where does it use the evaluation set?

According to the documentation (provided here), if the refit parameter is True (the default), the model is finally retrained with the best parameters found, on the entire dataset that was passed in (in this case, the training data).
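As a minimal sketch (assuming scikit-learn's usual behaviour of forwarding extra fit keyword arguments such as eval_set to the final refit, and reusing the variables from your question), the refitted best estimator is roughly equivalent to:

from lightgbm import LGBMClassifier

# Best parameter combination found by the random search
best_params = clf.best_params_

final_model = LGBMClassifier(objective='binary', metric='auc', random_state=0,
                             early_stopping_round=100, **best_params)

# The refit uses the whole training set passed to clf.fit(); the same
# eval_set is forwarded, so early stopping still monitors (X_test, y_test).
# This corresponds to the extra "+1 fit" you see in the log.
final_model.fit(X_train, y_train, eval_set=[(X_test, y_test)])

# clf.best_estimator_ holds the estimator produced by this final fit,
# and clf.predict(...) delegates to it.

So the evaluation set contributes no training rows at any point; it is only used for early stopping and metric reporting, both in every CV fit and in the final refit.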
