Plot precision and recall with sklearn

Question

I have created a classification model with a custom ML framework.

I have 3 classes: 1, 2, 3

Input sample:

# y_true, y_pred, and y_scores are lists

print(y_true[0], y_pred[0], y_scores[0])
print(y_true[1], y_pred[1], y_scores[1])
print(y_true[2], y_pred[2], y_scores[2])

1 1 0.6903580037019461
3 3 0.8805178752523366
1 2 0.32107199420078963

Using sklearn I'm able to use metrics.classification_report:

metrics.classification_report(y_true, y_pred)

                         precision    recall  f1-score   support

                      1      0.521     0.950     0.673        400
                      2      0.000     0.000     0.000        290
                      3      0.885     0.742     0.807        310

               accuracy                          0.610       1000
              macro avg      0.468     0.564     0.493       1000
           weighted avg      0.482     0.610     0.519       1000

I want to generate a precision vs. recall visualization.

But I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-108-2ebb913a4e4b> in <module>()
----> 1 precision, recall, thresholds = metrics.precision_recall_curve(y_true, y_scores)

1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_ranking.py in _binary_clf_curve(y_true, y_score, pos_label, sample_weight)
    534     if not (y_type == "binary" or
    535             (y_type == "multiclass" and pos_label is not None)):
--> 536         raise ValueError("{0} format is not supported".format(y_type))
    537 
    538     check_consistent_length(y_true, y_score, sample_weight)

ValueError: multiclass format is not supported

I found some examples:

But it's not very clear how to binarize my array if I already have the results. I'm looking for pointers on how to simply plot it.

Answer 1

precision_recall_curve has a parameter pos_label, the label of the "positive" class for the purposes of TP/TN/FP/FN. So you can extract the relevant probability and then generate the precision/recall points as:

from sklearn.metrics import precision_recall_curve

# Per-class probabilities, one column per class
y_pred = model.predict_proba(X)

index = 2  # or 0 or 1; maybe you want to loop?
label = model.classes_[index]  # see below
p, r, t = precision_recall_curve(y_true, y_pred[:, index], pos_label=label)
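
To make the role of pos_label concrete, here is a minimal pure-Python sketch of what "one class is positive, everything else is negative" means at a single threshold. The helper name precision_recall_at and the sample data are made up for illustration, and returning a precision of 1.0 when nothing is predicted positive is a convention chosen here, not something taken from the question.

```python
def precision_recall_at(y_true, scores, pos_label, threshold):
    """One-vs-rest precision and recall for `pos_label` at one threshold."""
    tp = fp = fn = 0
    for truth, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        actually_positive = truth == pos_label
        if predicted_positive and actually_positive:
            tp += 1
        elif predicted_positive:
            fp += 1
        elif actually_positive:
            fn += 1
    # Conventions for empty denominators (an assumption of this sketch)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical data: true labels plus the model's score for class 3 only
y_true = [1, 3, 1, 2, 3, 3]
scores_for_3 = [0.10, 0.88, 0.40, 0.35, 0.70, 0.20]

for threshold in (0.3, 0.5):
    print(threshold, precision_recall_at(y_true, scores_for_3, 3, threshold))
```

precision_recall_curve effectively repeats this calculation for every distinct score used as a threshold, which is why it needs a score for the positive class rather than only a predicted label.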

The main obnoxiousness here is that you need to extract the column of y_pred by index, but pos_label expects the actual class label. You can connect the two using model.classes_.
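
If you do want to loop, the index-to-label pairing could look roughly like the sketch below. It assumes you can get a full matrix of per-class scores out of your framework (the question only shows one score per sample, which is not enough for one curve per class), and that column i belongs to classes[i]. The function names and the sample numbers are made up for illustration.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve


def per_class_pr_curves(y_true, y_proba, classes):
    """Return {label: (precision, recall, thresholds)}, one-vs-rest per class.

    Assumes column i of y_proba holds the score for classes[i].
    """
    y_proba = np.asarray(y_proba)
    curves = {}
    for index, label in enumerate(classes):
        curves[label] = precision_recall_curve(
            y_true, y_proba[:, index], pos_label=label
        )
    return curves


def plot_pr_curves(curves):
    """Draw one precision-vs-recall line per class (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, (precision, recall, _) in curves.items():
        ax.plot(recall, precision, label=f"class {label}")
    ax.set_xlabel("Recall")
    ax.set_ylabel("Precision")
    ax.legend()
    return ax


# Hypothetical data: 6 samples, 3 classes, each row sums to 1
classes = [1, 2, 3]
y_true = [1, 3, 1, 2, 3, 2]
y_proba = [
    [0.70, 0.20, 0.10],
    [0.05, 0.15, 0.80],
    [0.40, 0.45, 0.15],
    [0.30, 0.50, 0.20],
    [0.10, 0.30, 0.60],
    [0.55, 0.35, 0.10],
]
curves = per_class_pr_curves(y_true, y_proba, classes)
```

Calling plot_pr_curves(curves) followed by plt.show() would then display all three curves on one set of axes.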

It's probably also worth noting that the new plotting convenience function plot_precision_recall_curve doesn't work with this: it takes the model as a parameter, and breaks if it is not a binary classification.
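
One way around that limitation, sketched here under assumptions rather than as the recommended route, is to build a PrecisionRecallDisplay directly from precomputed precision and recall values, so that no fitted model is needed. The helper name pr_display_for_class and the sample data are made up for illustration.

```python
from sklearn.metrics import PrecisionRecallDisplay, precision_recall_curve


def pr_display_for_class(y_true, class_scores, label):
    """Build a display object for one class treated as the positive class."""
    precision, recall, _ = precision_recall_curve(
        y_true, class_scores, pos_label=label
    )
    return PrecisionRecallDisplay(precision=precision, recall=recall)


# Hypothetical data: true labels plus the model's score for class 3 only
y_true = [1, 3, 1, 2, 3, 3]
scores_for_3 = [0.10, 0.88, 0.40, 0.35, 0.70, 0.20]

display = pr_display_for_class(y_true, scores_for_3, label=3)
# display.plot() draws the curve; that step requires matplotlib.
```

Repeating this per class and passing the same ax to each display.plot(ax=ax) call should put all the curves in one figure.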

Answer 1
0 2021-02-03 03:11:55