example-based f-score is smaller than precision and recall in Sklearn
I am doing multi-label classification, and the evaluation is done by precision_recall_fscore_support with average='samples':
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, f1_score

predict = np.array([[1,1,0,0], [1,0,0,1], [1,1,1,1]])
expect = np.array([[1,1,1,1], [1,1,1,0], [1,0,0,0]])
smp_report = precision_recall_fscore_support(expect, predict, average='samples')
f_report = f1_score(expect, predict, average='samples')
There are three instances in this example, and the binary values indicate the presence of each of the four classes.
smp_report and f_report then give me (0.58333333333333337, 0.61111111111111105, 0.48888888888888893, None) and 0.488888888889, respectively.
The f-score is not equal to 2*smp_report[0]*smp_report[1]/(smp_report[0]+smp_report[1]), which would be the harmonic mean of the precision and recall.
Could anyone tell me how Sklearn implements this? The version I am using is 0.15.0.
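A quick arithmetic check in plain Python (using the rounded values from smp_report above) makes the mismatch concrete:

```python
# Values reported by precision_recall_fscore_support(..., average='samples')
p = 0.5833333333333334   # averaged precision
r = 0.6111111111111112   # averaged recall
f = 0.4888888888888889   # averaged f-score

# Harmonic mean of the *averaged* precision and recall
harmonic = 2 * p * r / (p + r)
print(harmonic)  # ~0.597, not the reported f-score of ~0.489
```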
Scikit-learn first calculates the precision, recall, and harmonic f-measure for each item in your set of lists ([1,1,0,0], [1,0,0,1], [1,1,1,1]). It then calculates the mean of those precision values, the mean of those recall values, and the mean of those f-measures, and returns those mean values. These are the P, R, and F values you report above.
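That per-item-then-average procedure can be sketched directly in NumPy. This is a hypothetical re-implementation for illustration, not scikit-learn's actual code, and it assumes every row has at least one predicted and one true label, so no zero divisions occur:

```python
import numpy as np

predict = np.array([[1,1,0,0], [1,0,0,1], [1,1,1,1]])
expect = np.array([[1,1,1,1], [1,1,1,0], [1,0,0,0]])

tp = (predict & expect).sum(axis=1)                 # true positives per row
precision = tp / predict.sum(axis=1)                # per-row precision
recall = tp / expect.sum(axis=1)                    # per-row recall
f1 = 2 * precision * recall / (precision + recall)  # per-row harmonic mean

# Averaging the per-row values reproduces the 'samples'-averaged report
print(precision.mean(), recall.mean(), f1.mean())
```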
It is helpful to calculate the precision, recall, and f-measure values for a single item in your list. To calculate the P, R, and F values for the third item in your list, you can run:
import numpy as np
from sklearn import metrics
predict = np.array([[1,1,1,1]])
expect = np.array([[1,0,0,0]])
smp_report = metrics.precision_recall_fscore_support(expect, predict, beta=1, average='samples')
f_report = metrics.f1_score(expect, predict, average='samples')
print(f_report, smp_report)
Running this code gives you 0.4 (0.25, 1.0, 0.40000000000000002). The values inside the parentheses indicate the precision, recall, and f-measure for the classification (in that order). As you can see, the f-measure is the harmonic mean of precision and recall:

2 * [(.25 * 1) / (.25 + 1)] = .4
By swapping your first two lists into the code above, you can calculate the precision, recall, and harmonic f-measure for each of the three items in your data set:

first item values:
0.666666666667 (1.0, 0.5, 0.66666666666666663)
second item values:
0.4 (0.5, 0.33333333333333331, 0.40000000000000002)
third item values:
0.4 (0.25, 1.0, 0.40000000000000002)
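The same three per-item results can be reproduced without scikit-learn at all; here is a small plain-Python sketch, where each pair below is an (expect, predict) row from the example:

```python
pairs = [([1,1,1,1], [1,1,0,0]),  # item 1: (expect, predict)
         ([1,1,1,0], [1,0,0,1]),  # item 2
         ([1,0,0,0], [1,1,1,1])]  # item 3

results = []
for true_row, pred_row in pairs:
    tp = sum(t & p for t, p in zip(true_row, pred_row))  # shared positive labels
    precision = tp / sum(pred_row)
    recall = tp / sum(true_row)
    f1 = 2 * precision * recall / (precision + recall)
    results.append((precision, recall, f1))

print(results)
```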
SK then calculates the mean of these precision values, i.e. (1 + .5 + .25) / 3 = .5833333333333333, the mean of these recall values, (.5 + .333 + 1) / 3 = 0.61111111111111105, and the mean of these f-measures, (.666 + .4 + .4) / 3 = 0.48888888888888893, and returns those mean values. These are the values you report above. SK is calculating the harmonic mean for each classification event--it's simply returning the mean of those harmonic means.