
Using cross-validation to determine weights of machine learning algorithms (GridSearchCV, RidgeCV, StackingClassifier)

Original post 2020-07-15 19:44:02 | Tags: python, scikit-learn, cross-validation, gridsearchcv

My question has to do with GridSearchCV, RidgeCV, and StackingClassifier/Regressor.

StackingClassifier/StackingRegressor: as far as I know, it first fits each base estimator individually on the whole training set. Then it uses a cross-validation scheme in which the predictions of the base estimators become the new features used to train the final estimator. From the documentation: "To generalize and avoid over-fitting, the final_estimator is trained on out-samples using sklearn.model_selection.cross_val_predict internally."

My question is: what exactly does this mean? Does it split the training data into k folds and, for each fold, train the final estimator on the training part of the fold, test it on the held-out part, and then keep the final estimator weights from the fold with the best score? Or something else?
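For reference, the behaviour described in the documentation quote can be sketched by hand. This is only an illustration under assumptions, not scikit-learn's internal code: the dataset, the two base estimators, the 5-fold setting and the helper `stacked_predict` are all made up for the example. What it shows is that `cross_val_predict` produces one out-of-fold prediction per training row, and the final estimator is then fitted once on all of those predictions; no per-fold final estimator is selected.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

# Toy binary classification data (assumption, for illustration only).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hypothetical base estimators.
base_estimators = [
    RandomForestClassifier(n_estimators=20, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
]

# Out-of-fold predictions: every row is predicted by a clone of the
# estimator that was trained on the other folds and never saw that row.
meta_features = np.column_stack([
    cross_val_predict(est, X, y, cv=5, method="predict_proba")[:, 1]
    for est in base_estimators
])

# The final estimator is fitted a single time on all out-of-fold predictions.
final_estimator = LogisticRegression().fit(meta_features, y)

# The base estimators themselves are fitted on the full training set,
# so they can produce features for new data at prediction time.
fitted_base = [est.fit(X, y) for est in base_estimators]


def stacked_predict(X_new):
    """Predict with the hand-built stack (hypothetical helper)."""
    new_meta = np.column_stack(
        [est.predict_proba(X_new)[:, 1] for est in fitted_base]
    )
    return final_estimator.predict(new_meta)


print(meta_features.shape)          # one column per base estimator
print(stacked_predict(X[:5]))
```

In this sketch the cross-validation only exists to generate training features for the final estimator that are not contaminated by the base estimators having already seen the labels.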

I think I can group GridSearchCV and RidgeCV into the same question, as they are quite similar (although RidgeCV uses leave-one-out CV by default).

To find the best hyperparameters, do they run CV over all the folds for each hyperparameter setting, pick the setting with the best average score, and then, after finding it, train the model with those hyperparameters on the WHOLE training set? Or am I looking at it wrong?
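As a small illustration of the RidgeCV side of the question, here is a sketch assuming a toy regression dataset and an arbitrary grid of `alphas` (both invented for the example). With the default `cv=None`, RidgeCV uses an efficient leave-one-out scheme; passing an integer such as `cv=5` switches to ordinary k-fold. In both cases the fitted object exposes the selected `alpha_` together with `coef_` for the model fitted with that alpha.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV

# Toy data and candidate regularisation strengths (assumptions).
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)
alphas = [0.01, 0.1, 1.0, 10.0]

# cv=None (the default) selects the efficient leave-one-out scheme.
loo_model = RidgeCV(alphas=alphas).fit(X, y)

# An integer cv switches to ordinary k-fold cross-validation.
kfold_model = RidgeCV(alphas=alphas, cv=5).fit(X, y)

print("alpha chosen by leave-one-out:", loo_model.alpha_)
print("alpha chosen by 5-fold:", kfold_model.alpha_)
print("coefficients of the final model:", loo_model.coef_)
```

The two schemes may or may not agree on the chosen alpha; that depends on the data.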

If anyone could shed some light on this, that would be great. Thanks!

1 answer

You're exactly right. The process looks like this:

1. Select the first set of hyperparameters
2. Partition the data into k folds
3. For each fold, train the model on the remaining folds and score it on the held-out fold
4. Obtain the average score (loss, r2, or whatever criterion was specified)
5. Repeat steps 2-4 for all other sets of hyperparameters
6. Choose the set of hyperparameters with the best average score
7. Retrain the model on the entire dataset (as opposed to a single fold) using the best hyperparameters
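The steps above can be sketched by hand and compared with GridSearchCV. This is a minimal sketch, not the library's implementation: the toy dataset, the `Ridge` model, the `alphas` grid and the shuffled 5-fold splitter are assumptions chosen for illustration. The final refit on the whole dataset corresponds to GridSearchCV's default `refit=True`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

# Toy data and hyperparameter grid (assumptions).
X, y = make_regression(n_samples=120, n_features=8, noise=15.0, random_state=0)
alphas = [0.1, 1.0, 10.0, 100.0]

# A fixed splitter so every candidate is scored on the same folds.
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Steps 1-5: average cross-validated score for each candidate.
mean_scores = [
    cross_val_score(Ridge(alpha=a), X, y, cv=cv, scoring="r2").mean()
    for a in alphas
]

# Step 6: keep the candidate with the best average score.
best_alpha = alphas[int(np.argmax(mean_scores))]

# Step 7: refit on the entire dataset with the chosen hyperparameter.
manual_model = Ridge(alpha=best_alpha).fit(X, y)

# The same search via GridSearchCV (refit=True is the default).
search = GridSearchCV(Ridge(), {"alpha": alphas}, cv=cv, scoring="r2").fit(X, y)

print("manual best alpha:", best_alpha)
print("GridSearchCV best alpha:", search.best_params_["alpha"])
print("same coefficients:", np.allclose(manual_model.coef_, search.best_estimator_.coef_))
```

After fitting, `search.best_estimator_` is the model retrained on all of the data, and `search.cv_results_` holds the per-fold and mean scores for every candidate.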

Related questions

Does GridSearchCV perform cross-validation?

Is there a way to see the folds for cross-validation in GridSearchCV?

How to test for overfitting in regression cross-validation with GridSearchCV?

Implementing GridSearchCV with scorer for Leave One Out Cross-Validation

How to give GridSearchCV a list of indicies for cross-validation?

Understanding Cross Validation for Machine learning

how does the cross-validation work in learning curve? Python sklearn

Nested cross-validation: How does cross_validate handle GridSearchCV as its input estimator?

Scikit Learn GridSearchCV without cross validation (unsupervised learning)

Cross-validation in LightGBM





solution1 1 2020-07-15 19:54:15
