I know how to perform cross-validation with the basic utility functions in scikit-learn, such as cross_val_score or cross_validate.
However, I have started using XGBoost, and to pass its fit params I have to cross-validate manually using the split method of a CV splitter. My question is: should I initialize a new model inside the loop for each fold, like this:
from sklearn.model_selection import KFold
import xgboost as xgb

cv = KFold(5)
for train_idx, test_idx in cv.split(X, y):
    model = xgb.XGBRegressor()
    model.fit(X[train_idx], y[train_idx], eval_metric='rmsle')
    ...
or initialize a single model once, outside the loop, like this:
cv = KFold(5)
model = xgb.XGBRegressor()
for train_idx, test_idx in cv.split(X, y):
    model.fit(X[train_idx], y[train_idx], eval_metric='rmsle')
    ...
I already received an answer from someone else: you should initialize a new model in each fold, so that every fold is trained from scratch and the fold scores are independent estimates of generalization error.