How to tell scikit-learn for which label the F-1/precision/recall score is given (in binary classification)?
As explained in this article, it matters for calculating the F-1 score (that is, for calculating recall and precision) whether those calculations are based on the positive or the negative class. For example, if I have a skewed dataset with 1% of labels in category A and 99% in category B, and I simply treat B as the positive category and classify all test items as positive, my F-1 score will be very good. How do I tell scikit-learn which category is the positive category in a binary classification? (If helpful, I can provide code.)
For binary classification, sklearn.metrics.f1_score by default assumes that 1 is the positive class and 0 is the negative class. If you use those conventions (0 for category B and 1 for category A), it should give you the desired behavior. You can override this default by passing the pos_label keyword argument to f1_score.
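As a rough illustration of the pos_label override, here is a minimal sketch. The toy labels "A" and "B" and the predictions are made up for this example and are not from the question. With string labels, the default pos_label=1 matches neither class, so pos_label should be set explicitly.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical toy data: 1 sample of the rare class "A", 9 of class "B".
y_true = ["A"] + ["B"] * 9
# Hypothetical predictions: the classifier flags two items as "A".
y_pred = ["A", "A"] + ["B"] * 8

# Score the rare class "A" as the positive class.
f1_a = f1_score(y_true, y_pred, pos_label="A")
precision_a = precision_score(y_true, y_pred, pos_label="A")
recall_a = recall_score(y_true, y_pred, pos_label="A")

# Score the majority class "B" as the positive class instead.
f1_b = f1_score(y_true, y_pred, pos_label="B")

print(f"positive = A: precision={precision_a:.3f} recall={recall_a:.3f} F1={f1_a:.3f}")
print(f"positive = B: F1={f1_b:.3f}")
# F1 is about 0.667 with "A" as positive and about 0.941 with "B" as positive.
```

The same predictions score very differently depending on which class is treated as positive, which is why the choice matters on skewed data. precision_score and recall_score accept the same pos_label argument.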
See: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html