Computing training score using cross_val_score

Question

I am using cross_val_score to compute the mean score for a regressor. Here's a small snippet.

from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# X is the feature matrix and y_reg the regression target (defined elsewhere)
cross_val_score(LinearRegression(), X, y_reg, cv=5)

Using this I get an array of scores. I would like to know how the scores on the validation set (as returned in the array above) differ from those on the training set, to understand whether my model is over-fitting or under-fitting.

Is there a way of doing this with the cross_val_score function?

Answer 1

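You can use cross_validate instead of cross_val_score. According to the documentation:

The cross_validate function differs from cross_val_score in two ways:

It allows specifying multiple metrics for evaluation.

It returns a dict containing training scores, fit-times and score-times in addition to the test score.

Note that in recent scikit-learn versions the training scores are only included in the returned dict when you pass return_train_score=True.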
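A minimal sketch of how this could look for the question's regressor. The synthetic data from make_regression is only a stand-in for the asker's X and y_reg, which are not shown in the question; return_train_score=True is passed explicitly so that the training scores are included.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Assumption: synthetic data standing in for the asker's X and y_reg.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# return_train_score=True asks for the score on each training fold as well.
results = cross_validate(
    LinearRegression(), X, y_reg, cv=5, return_train_score=True
)

train_scores = results["train_score"]  # R^2 on the 4 folds used for fitting
test_scores = results["test_score"]    # R^2 on the held-out fold

print("train R^2 per fold:", train_scores)
print("test R^2 per fold: ", test_scores)
print("mean gap (train - test):", train_scores.mean() - test_scores.mean())
```

A training score that is much higher than the test score suggests over-fitting, while two low scores that sit close together suggest under-fitting.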

Answer 2

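Why would you want that? cross_val_score(cv=5) already splits your data into 5 folds, fits the model on 4 of them, and scores it on the remaining held-out fold, repeating this 5 times. For a regressor the default score is R², not accuracy. This gives you an estimate of how well the model generalizes, which helps you detect over-fitting, although it does not prevent it.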
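To see what cv=5 does to the data, here is a small sketch using KFold directly. For a regressor and an integer cv, cross_val_score uses KFold without shuffling; the toy array of 10 samples is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold

# Assumption: 10 toy samples with 2 features each, just to show the indices.
X = np.arange(20).reshape(10, 2)

kfold = KFold(n_splits=5)
splits = list(kfold.split(X))

# Each split pairs the indices used for fitting with the held-out indices.
for i, (train_idx, test_idx) in enumerate(splits):
    print(f"fold {i}: train={train_idx}, test={test_idx}")
```

Each sample lands in the held-out fold exactly once, so there are 5 fits and 5 scores.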

Anyway, if you want the score on the training data itself, you have to fit your LinearRegression on X and y_reg first and then score it on that same data.
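A rough sketch of that manual approach, again with synthetic data standing in for the asker's X and y_reg (an assumption). It fits on the full data set and reports R² on the same data, so it is a single training score rather than a per-fold one.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

# Assumption: synthetic data standing in for the asker's X and y_reg.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Fit on all the data, then score on the data the model was fitted on.
model = LinearRegression().fit(X, y_reg)
train_r2 = model.score(X, y_reg)  # score() returns R^2 for regressors

print("R^2 on the training data:", train_r2)
```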

2 answers

solution1
14 2017-09-30 08:47:05

solution2
-3 ACCEPTED 2017-06-22 08:46:24
