简体   繁体   中英

Using cross-validation to determine weights of machine learning algorithms (GridSearchCv,RidgeCV,StackingClassifier)

My question has to do with GridSearchCV, RidgeCV, and StackingClassifier/Regressor.

  1. Stacking Classifier/Regressor-AFAIK, it first trains the whole train set individually for each base estimator. Then, it uses a cross validation scheme, using the predictions for each base estimator as the new features to train the new final estimator. From the documentation: "To generalize and avoid over-fitting, the final_estimator is trained on out-samples using sklearn.model_selection.cross_val_predict internally."

My question is, what exactly does this mean? Does it break the train data into k folds, and then for each fold, train the final estimator on the training section of the fold, test it on the testing section of the fold, and then take the final estimator weights from the fold with the best score? or what?

  1. I think I can group GridSearchCV and RidgeCV into the same question as they are quite similar. ( albeit, ridgeCV uses one vs all CV by default)

-To find the best hyperparameters, do they do a CV on all the folds, for each hyperparameter, find the hyperparameters that had the best average score AND THEN AFTER finding the best hyperparameters, train the model with the best hyperparameters, using the WHOLE training set? Or am I looking at it wrong?

If anyone could shed some light on this, that would be great. Thanks!

You're exactly right. The process looks like this:

  1. Select the first set of hyperparameters
  2. Partition the data into k-folds
  3. Run the model on each fold
  4. Obtain the average score (loss, r2, or whatever specified criteria)
  5. Repeat steps 2-4 for all other sets of hyperparameters
  6. Choose the set of hyperparameters with the best score
  7. Retrain the model on the entire dataset (as opposed to a single fold) using the best hyperparameters

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM