
Cross-validate precision, recall and f1 together with sklearn

Is there a simple way to cross-validate a classifier and compute precision and recall in a single pass? Currently I use the function

cross_validation.cross_val_score(classifier, designMatrix, classes, cv=5, scoring="precision")

However, it computes only one metric, so I have to call it twice to get both precision and recall. With a large ML model, the computation then takes twice as long for no good reason. Is there a better built-in option, or do I have to implement the cross-validation myself? Thanks.

I am unsure of the current state of affairs (this feature has been discussed), but you can always get away with the following (awful) hack:

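# make_scorer is exposed by sklearn.metrics; the sklearn.metrics.scorer module is private and no longer importable in recent releases
from sklearn.metrics import make_scorer, precision_score, recall_score

# Global list that collects the recall of each fold as a side effect
recall_accumulator = []

def score_func(y_true, y_pred, **kwargs):
    # Record recall, then return precision as the "official" score
    recall_accumulator.append(recall_score(y_true, y_pred, **kwargs))
    return precision_score(y_true, y_pred, **kwargs)

scorer = make_scorer(score_func)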

Then pass scoring=scorer to your cross-validation call. You should find the recall values in the recall_accumulator list, one per fold. Watch out though: this list is global, so make sure nothing else writes to it, or you will not be able to interpret the results.
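For illustration, here is a minimal end-to-end sketch of the hack above. The synthetic dataset from make_classification and the LogisticRegression classifier are assumptions chosen only to make the example runnable; they are not part of the original answer. It runs with n_jobs=1 so that the scorer is called in the same process and the global list is actually filled.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import make_scorer, precision_score, recall_score
from sklearn.model_selection import cross_val_score

# Assumed toy data and model, used only to exercise the scorer
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
classifier = LogisticRegression(max_iter=1000)

# Global list filled as a side effect of scoring
recall_accumulator = []

def score_func(y_true, y_pred, **kwargs):
    # Store recall for this fold, return precision as the score
    recall_accumulator.append(recall_score(y_true, y_pred, **kwargs))
    return precision_score(y_true, y_pred, **kwargs)

scorer = make_scorer(score_func)

# n_jobs=1 keeps scoring in this process, so the global list is visible
precisions = cross_val_score(classifier, X, y, cv=5, scoring=scorer, n_jobs=1)

print("precision per fold:", precisions)
print("recall per fold:", recall_accumulator)
```

If the list is reused across several cross-validation runs, clearing it between runs (recall_accumulator.clear()) keeps the results attributable to a single run.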

eickenberg's answer works when the n_jobs argument of cross_val_score() is set to 1. To support parallel computing (n_jobs > 1), one has to use a shared list instead of a global list. This can be done with the help of the Manager class from the multiprocessing module.

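from multiprocessing import Manager

# make_scorer is exposed by sklearn.metrics; the sklearn.metrics.scorer module is private and no longer importable in recent releases
from sklearn.metrics import make_scorer, precision_recall_fscore_support

# Shared list that worker processes can append to
recall_accumulator = Manager().list()

def score_func(y_true, y_pred, **kwargs):
    # Store the full (precision, recall, fscore, support) tuple for this fold
    recall_accumulator.append(precision_recall_fscore_support(y_true, y_pred))
    # The returned score is a dummy value; the real results are in the list
    return 0

scorer = make_scorer(score_func)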

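Then the result of each fold will be stored in recall_accumulator. Despite the name, each entry is the full (precision, recall, fscore, support) tuple returned by precision_recall_fscore_support.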
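To show what one stored entry looks like, here is a small sketch that calls precision_recall_fscore_support directly on hand-written labels (the label lists are made up for illustration). With the default average=None, each element of the tuple is an array with one value per class.

```python
from sklearn.metrics import precision_recall_fscore_support

# Made-up labels for a single hypothetical fold
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

# Each of the four results is an array indexed by class (here: class 0, class 1)
precision, recall, fscore, support = precision_recall_fscore_support(y_true, y_pred)

for cls in range(len(support)):
    print(
        "class", cls,
        "precision", precision[cls],
        "recall", recall[cls],
        "f1", fscore[cls],
        "support", support[cls],
    )
```

An entry taken from the shared list can be unpacked in the same way, for example precision, recall, fscore, support = recall_accumulator[0].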

I came here searching for an answer to the same question, so I'm leaving this for the next person.

You can use cross_validate. It accepts multiple metric names in the scoring parameter.

from sklearn.model_selection import cross_validate

scores = cross_validate(model, X, y, scoring=('precision', 'recall', 'f1'), cv=5)
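As a rough sketch of how the result can be read: cross_validate returns a dict, and each requested metric appears under a key of the form test_<metric name> holding one value per fold. The toy dataset and the LogisticRegression model below are assumptions made only so the example runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

# Assumed toy data and model, used only for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=0)
model = LogisticRegression(max_iter=1000)

# Each fold is fitted once and scored with all three metrics
scores = cross_validate(model, X, y, scoring=("precision", "recall", "f1"), cv=5)

# Per-fold values are stored under "test_<metric>"
for name in ("precision", "recall", "f1"):
    fold_values = scores["test_" + name]
    print(name, "mean:", fold_values.mean(), "std:", fold_values.std())
```

Because every fold is fitted only once, this avoids the doubled training cost described in the question.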

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.
