Computing training score using cross_val_score
I am using cross_val_score to compute the mean score for a regressor. Here's a small snippet.
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
cross_val_score(LinearRegression(), X, y_reg, cv=5)
Using this I get an array of scores. I would like to know how the scores on the validation set (as returned in the array above) differ from those on the training set, to understand whether my model is over-fitting or under-fitting.
Is there a way of doing this with cross_val_score?
You can use cross_validate instead of cross_val_score. According to the docs:
The cross_validate function differs from cross_val_score in two ways:
- It allows specifying multiple metrics for evaluation.
- It returns a dict containing training scores, fit-times and score-times in addition to the test score.
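To get training scores from cross_validate you need to pass return_train_score=True (it defaults to False in recent scikit-learn versions). A minimal sketch, using a synthetic X and y_reg generated with make_regression so it runs on its own:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_validate

# Stand-ins for your own feature matrix and regression target.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

scores = cross_validate(
    LinearRegression(), X, y_reg, cv=5,
    return_train_score=True,  # required to get train scores alongside test scores
)

# A large gap between the two means suggests over-fitting;
# two similarly low means suggest under-fitting.
print("mean train score:", scores["train_score"].mean())
print("mean test score: ", scores["test_score"].mean())
```

The returned dict also contains "fit_time" and "score_time" arrays, one entry per fold.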
Why would you want that? cross_val_score(cv=5) already does this for you: it splits your training data into 5 folds and reports the score on each of the 5 held-out subsets. This already serves as a way to detect over-fitting.
Anyway, if you are eager to verify accuracy on your validation data, then you have to fit your LinearRegression on X and y_reg first.
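A minimal sketch of that manual approach: hold out a validation split, fit first, then score both portions. X and y_reg are again generated synthetically here so the snippet is self-contained:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Stand-ins for your own feature matrix and regression target.
X, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

X_train, X_val, y_train, y_val = train_test_split(
    X, y_reg, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)  # fit first, as noted above

# score() returns R^2 for regressors; compare train vs. validation.
print("train R^2:", model.score(X_train, y_train))
print("val   R^2:", model.score(X_val, y_val))
```

This gives you a single train/validation comparison, whereas cross_validate averages the same comparison over all folds.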