I know how to perform cross-validation with the basic utility functions in scikit-learn, such as cross_val_score or cross_validate.
However, I have started using XGBoost, and to pass its fit params I have to cross-validate manually using the split method of a CV splitter. My question is: should I initialize a new model inside the loop for each fold, like this:
from sklearn.model_selection import KFold
import xgboost as xgb

cv = KFold(5)
for train_idx, test_idx in cv.split(X, y):
    model = xgb.XGBRegressor()
    model.fit(X[train_idx], y[train_idx], eval_metric='rmsle')
    ...
or initialize a single model once, outside the loop, like this:
cv = KFold(5)
model = xgb.XGBRegressor()
for train_idx, test_idx in cv.split(X, y):
    model.fit(X[train_idx], y[train_idx], eval_metric='rmsle')
    ...
I already received an answer from someone else: you should initialize a new model in each fold, so that every fold is trained from scratch and the fold scores are independent estimates of generalization error.