
Computing training score using cross_val_score

I am using cross_val_score to compute the mean score for a regressor. Here's a small snippet.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

cross_val_score(LinearRegression(), X, y_reg, cv=5)

Using this I get an array of scores. I would like to know how the scores on the validation set (as returned in the array above) differ from those on the training set, to understand whether my model is over-fitting or under-fitting.

Is there a way of doing this with cross_val_score?

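You can use cross_validate instead of cross_val_score. Note that in recent scikit-learn versions you need to pass return_train_score=True to get the training scores. According to the docs: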

The cross_validate function differs from cross_val_score in two ways -

  • It allows specifying multiple metrics for evaluation.
  • It returns a dict containing training scores, fit-times and score-times in addition to the test score.
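A minimal sketch of how this could look for the question's regressor. The synthetic data from make_regression is only a stand-in for the asker's X and y_reg, which are not shown; the comparison of mean scores at the end is one possible way to read the result, not the only one.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Assumption: synthetic data standing in for the question's X and y_reg.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# return_train_score=True adds a "train_score" entry next to "test_score".
results = cross_validate(
    LinearRegression(), X, y_reg, cv=5, return_train_score=True
)

train_scores = results["train_score"]  # one R^2 value per fold, on the training part
test_scores = results["test_score"]    # one R^2 value per fold, on the held-out part

print("train R^2 per fold:", train_scores)
print("test  R^2 per fold:", test_scores)
print("mean gap (train - test):", train_scores.mean() - test_scores.mean())
```

A training score that is much higher than the test score suggests over-fitting; low scores on both suggest under-fitting.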

Why would you want that? cross_val_score with cv=5 already does that for you: it splits your data into 5 folds and, for each fold, fits the model on the other 4 and scores it on the held-out one (for a regressor the default score is R², not accuracy). This already serves as a check against over-fitting.

Anyway, if you are eager to verify the score yourself, then you have to fit your LinearRegression on X and y_reg first.
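A rough sketch of that manual check, assuming synthetic data in place of the asker's X and y_reg: fit once on all the data, score on that same data, and compare with the mean cross-validated score.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Assumption: synthetic data standing in for the question's X and y_reg.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Fit on all the data, then score on the same data (R^2 on the training set).
model = LinearRegression().fit(X, y_reg)
train_r2 = model.score(X, y_reg)

# Mean R^2 over the 5 held-out folds, for comparison.
cv_r2 = cross_val_score(LinearRegression(), X, y_reg, cv=5).mean()

print("R^2 on training data:", train_r2)
print("mean cross-validated R^2:", cv_r2)
```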
