
Cross-validate precision, recall and f1 together with sklearn

Is there any simple way to cross-validate a classifier and compute precision and recall at once? Currently I use

from sklearn.model_selection import cross_val_score
cross_val_score(classifier, designMatrix, classes, cv=5, scoring="precision")

However, it computes only one metric at a time, so I have to call it twice to get both precision and recall. With a large ML model, the computation then takes twice as long for no good reason. Is there a better built-in option, or do I have to implement the cross-validation myself? Thanks.

I am unsure of the current state of affairs (this feature has been discussed), but you can always get away with the following (awful) hack:

from sklearn.metrics import recall_score, precision_score, make_scorer

recall_accumulator = []

def score_func(y_true, y_pred, **kwargs):
    # Side effect: record recall for this fold; report precision as the score.
    recall_accumulator.append(recall_score(y_true, y_pred, **kwargs))
    return precision_score(y_true, y_pred, **kwargs)

scorer = make_scorer(score_func)

Then pass scoring=scorer to your cross-validation. You should find the recall values in the recall_accumulator list. Watch out though: this list is global, so make sure you don't write to it in a way that makes the results impossible to interpret.
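For reference, a minimal self-contained sketch of this hack; the toy dataset and LogisticRegression classifier are placeholders for illustration, not part of the original answer:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score, precision_score, make_scorer
from sklearn.model_selection import cross_val_score

recall_accumulator = []

def score_func(y_true, y_pred, **kwargs):
    # Stash the fold's recall as a side effect, return precision as the score.
    recall_accumulator.append(recall_score(y_true, y_pred, **kwargs))
    return precision_score(y_true, y_pred, **kwargs)

scorer = make_scorer(score_func)

# Toy binary classification problem standing in for the real designMatrix/classes.
X, y = make_classification(n_samples=200, random_state=0)
precisions = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                             cv=5, scoring=scorer)
print(len(precisions), len(recall_accumulator))
```

With the default sequential execution (n_jobs=1), both the returned array and the accumulator end up with one entry per fold.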

eickenberg's answer works when the n_jobs argument of cross_val_score() is set to 1. To support parallel computing (n_jobs > 1), one has to use a shared list instead of a global list. This can be done with the help of the Manager class from the multiprocessing module.

from sklearn.metrics import precision_recall_fscore_support, make_scorer
from multiprocessing import Manager

# A Manager-backed list is shared across worker processes.
recall_accumulator = Manager().list()

def score_func(y_true, y_pred, **kwargs):
    recall_accumulator.append(precision_recall_fscore_support(y_true, y_pred))
    return 0

scorer = make_scorer(score_func)

Then the result of each fold will be stored in recall_accumulator.

I also searched with the same question, so I'm leaving this for the next person.

You can use cross_validate. It accepts multiple metric names in the scoring parameter.

from sklearn.model_selection import cross_validate

scores = cross_validate(model, X, y, scoring=('precision', 'recall', 'f1'), cv=5)
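To show how to read the result, here is a self-contained sketch; the toy dataset and LogisticRegression model are placeholders. cross_validate returns a dict with one per-fold array per metric, keyed as test_&lt;metric&gt;:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Placeholder data and model standing in for the real X, y and classifier.
X, y = make_classification(n_samples=200, random_state=0)
model = LogisticRegression(max_iter=1000)

scores = cross_validate(model, X, y, scoring=('precision', 'recall', 'f1'), cv=5)

# One array of 5 per-fold values per requested metric.
print(scores['test_precision'].mean())
print(scores['test_recall'].mean())
print(scores['test_f1'].mean())
```

Unlike the scorer hack, this computes all three metrics in a single cross-validation run, and also works with n_jobs > 1.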

