
Implementing GridSearchCV with scorer for Leave One Out Cross-Validation

I am attempting to implement scikit-learn's GridSearchCV for Gaussian Process Regression (GPR). I'm using a small dataset of ~200 points, and would like to use LOOCV as a performance evaluator for my model. My setup is:

# Explicit imports instead of wildcards; sklearn.ensemble was unused
from sklearn.gaussian_process import GaussianProcessRegressor, kernels
from sklearn.model_selection import GridSearchCV, LeaveOneOut

param_grid = {
    'kernel': [kernels.RBF(), kernels.Matern(length_scale=0.1)],
    'n_restarts_optimizer': [5, 10, 20, 25],
    'random_state': [30],
}
# No scoring argument, so the estimator's default score (R^2) is used
res_GPR = GridSearchCV(estimator=GaussianProcessRegressor(),
                       param_grid=param_grid,
                       cv=LeaveOneOut(),
                       verbose=20,
                       n_jobs=-1)
res_GPR.fit(X, y)

where X and y are my data points and target values, respectively. I know that the default score used by GaussianProcessRegressor is R^2, which is undefined in the LOOCV case because each test fold contains only one element; this is confirmed by the NaN I get for the .best_score_ attribute of the fitted search. Instead, I would like the model to be scored with the root mean squared error (RMSE) on each test case, averaged over all iterations. Any input on how to implement this evaluation method would be greatly appreciated.
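To make the R^2 problem concrete, here is a minimal hand-written sketch of the coefficient of determination computed from its definition. The helper name `r2` and the sample values are hypothetical and only for illustration; this is not scikit-learn's implementation. With a single held-out point the total sum of squares is zero, so the ratio has no defined value.

```python
def r2(y_true, y_pred):
    """Coefficient of determination from its definition: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        # A single sample always equals its own mean, so SS_tot is 0
        # and 1 - SS_res / 0 is undefined.
        return float("nan")
    return 1 - ss_res / ss_tot


# One held-out point, as in every LOOCV fold (illustrative values)
single = r2([3.0], [2.5])

# Several points: the score is well defined
several = r2([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])
```

Averaging NaN over all folds stays NaN, which is consistent with the NaN seen in .best_score_.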

GridSearchCV accepts a scoring argument, which you can set to negative RMSE (negated because GridSearchCV always maximizes the score):

res_GPR = GridSearchCV(estimator=GaussianProcessRegressor(),
                       param_grid=param_grid,
                       cv=LeaveOneOut(),
                       verbose=20,
                       n_jobs=-1,
                       scoring='neg_root_mean_squared_error')

See the documentation and the list of available scores for more.
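For a quick end-to-end check, the following is a minimal sketch under stated assumptions rather than the asker's actual setup: the data is synthetic (a noisy sine curve), the grid is reduced so it runs quickly, and `alpha=1e-2` is an assumed noise level added for numerical stability. It also requests negative MAE as a second metric to show how the two scores relate under LOOCV.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor, kernels
from sklearn.model_selection import GridSearchCV, LeaveOneOut

# Synthetic stand-in for the real X and y (assumption: 1-D inputs, noisy sine target)
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(20, 1))
y = np.sin(X).ravel() + 0.1 * rng.normal(size=20)

# Reduced grid so the sketch runs in a few seconds
param_grid = {
    "kernel": [kernels.RBF(), kernels.Matern(length_scale=0.1)],
    "n_restarts_optimizer": [0, 2],
}

search = GridSearchCV(
    # alpha=1e-2 is an assumed noise level, not taken from the question
    estimator=GaussianProcessRegressor(alpha=1e-2, random_state=30),
    param_grid=param_grid,
    cv=LeaveOneOut(),
    # Two metrics are evaluated; the best model is chosen by negative RMSE
    scoring=["neg_root_mean_squared_error", "neg_mean_absolute_error"],
    refit="neg_root_mean_squared_error",
)
search.fit(X, y)

# Scores are negated errors, so flip the sign to report an error value
best_rmse = -search.best_score_
rmse_means = search.cv_results_["mean_test_neg_root_mean_squared_error"]
mae_means = search.cv_results_["mean_test_neg_mean_absolute_error"]

print(search.best_params_)
print("LOOCV error of best model:", best_rmse)
```

Because each LOOCV fold holds exactly one sample, the per-fold RMSE is just the absolute error on that sample, so the fold-averaged score coincides with the negative mean absolute error. If a single RMSE over all held-out predictions is wanted instead, it would have to be computed from the pooled predictions rather than averaged per fold.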
