Plot precision and recall with sklearn
I have created a classification model with a custom ML framework.
I have 3 classes: 1, 2, 3.
Input sample:
# y_true, y_pred, and y_scores are lists
print(y_true[0], y_pred[0], y_scores[0])
print(y_true[1], y_pred[1], y_scores[1])
print(y_true[2], y_pred[2], y_scores[2])
1 1 0.6903580037019461
3 3 0.8805178752523366
1 2 0.32107199420078963
Using sklearn I'm able to use metrics.classification_report:
metrics.classification_report(y_true, y_pred)
              precision    recall  f1-score   support

           1      0.521     0.950     0.673       400
           2      0.000     0.000     0.000       290
           3      0.885     0.742     0.807       310

    accuracy                          0.610      1000
   macro avg      0.468     0.564     0.493      1000
weighted avg      0.482     0.610     0.519      1000
I want to generate a precision vs. recall visualization.
But I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-108-2ebb913a4e4b> in <module>()
----> 1 precision, recall, thresholds = metrics.precision_recall_curve(y_true, y_scores)
1 frames
/usr/local/lib/python3.6/dist-packages/sklearn/metrics/_ranking.py in _binary_clf_curve(y_true, y_score, pos_label, sample_weight)
534 if not (y_type == "binary" or
535 (y_type == "multiclass" and pos_label is not None)):
--> 536 raise ValueError("{0} format is not supported".format(y_type))
537
538 check_consistent_length(y_true, y_score, sample_weight)
ValueError: multiclass format is not supported
I found some examples:
But it is not very clear how to binarize my array if I already have the results. I'm looking for pointers on how to simply plot it.
precision_recall_curve has a parameter pos_label, the label of the "positive" class for the purposes of TP/TN/FP/FN. So you can extract the relevant probability and then generate the precision/recall points as:
from sklearn.metrics import precision_recall_curve

y_pred = model.predict_proba(X)  # one column of probabilities per class
index = 2  # or 0 or 1; maybe you want to loop?
label = model.classes_[index]  # see below
p, r, t = precision_recall_curve(y_true, y_pred[:, index], pos_label=label)
The main obnoxiousness here is that you need to extract the column of y_pred by index, but pos_label expects the actual class label. You can connect those using model.classes_.
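To make the index/label pairing concrete, here is a minimal sketch that loops over model.classes_ and collects one precision/recall curve per class. The synthetic data, the LogisticRegression model, and the names y_proba and curves are assumptions for illustration only; the approach needs a per-class probability column, so it presumes your framework can give you scores for every class and not just the predicted one.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_recall_curve

# Synthetic 3-class data standing in for the real problem (an assumption).
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, n_classes=3, random_state=0
)
y_true = y + 1  # relabel 0, 1, 2 as 1, 2, 3 to match the question

# Any classifier exposing predict_proba and classes_ would do here.
model = LogisticRegression(max_iter=1000).fit(X, y_true)
y_proba = model.predict_proba(X)  # shape (n_samples, n_classes)

curves = {}
for index, label in enumerate(model.classes_):
    # Column `index` holds the probability of class `label`;
    # pos_label tells sklearn to treat that class as "positive" (one-vs-rest).
    p, r, t = precision_recall_curve(y_true, y_proba[:, index], pos_label=label)
    curves[int(label)] = (p, r, t)
```

Each entry in curves holds the precision, recall, and threshold arrays for one class treated as positive against the other two.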
It's probably also worth noting that the new plotting convenience function plot_precision_recall_curve doesn't work with this: it takes the model as a parameter, and breaks if it is not a binary classification.
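Since the convenience function is out, one option is to plot the arrays returned by precision_recall_curve directly with matplotlib. The sketch below is only an illustration: the labels and the scores matrix are randomly generated stand-ins (not the asker's data), and it assumes you have one score column per class.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve

rng = np.random.default_rng(0)
classes = [1, 2, 3]
y_true = rng.choice(classes, size=200)

# Made-up scores: random noise plus a bonus for the true class,
# then normalised so that each row sums to 1.
scores = rng.random((200, 3))
scores[np.arange(200), y_true - 1] += 0.5
scores /= scores.sum(axis=1, keepdims=True)

fig, ax = plt.subplots()
for index, label in enumerate(classes):
    # One-vs-rest curve: class `label` is positive, the others negative.
    p, r, _ = precision_recall_curve(y_true, scores[:, index], pos_label=label)
    ax.plot(r, p, label=f"class {label}")
ax.set_xlabel("Recall")
ax.set_ylabel("Precision")
ax.legend()
```

Call plt.show() or fig.savefig(...) afterwards to view or store the figure.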