I am doing multi-label classification, and the evaluation is done with precision_recall_fscore_support using average='samples':
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, f1_score

predict = np.array([[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 1, 1]])
expect = np.array([[1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 0]])
smp_report = precision_recall_fscore_support(expect, predict, average='samples')
f_report = f1_score(expect, predict, average='samples')
There are three instances in this example, and each binary value indicates whether the corresponding one of the four classes is present.
smp_report and f_report then give me (0.58333333333333337, 0.61111111111111105, 0.48888888888888893, None) and 0.488888888889, respectively.
The F-score is not equal to 2*smp_report[0]*smp_report[1]/(smp_report[0]+smp_report[1]), which would be the harmonic mean of the averaged precision and recall.
Could anyone tell me how scikit-learn implements this? The version I am using is 0.15.0.
Scikit-learn first calculates the precision, recall, and F-measure for each sample in your set (the rows [1,1,0,0], [1,0,0,1], and [1,1,1,1]). It then calculates the mean of those precision values, the mean of those recall values, and the mean of those F-measures, and returns those mean values. These are the P, R, and F values you report above.
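To make the per-sample averaging concrete, here is a minimal numpy sketch (not scikit-learn's actual implementation) that reproduces the three averaged values for your data by hand:

```python
import numpy as np

predict = np.array([[1, 1, 0, 0], [1, 0, 0, 1], [1, 1, 1, 1]])
expect = np.array([[1, 1, 1, 1], [1, 1, 1, 0], [1, 0, 0, 0]])

precisions, recalls, fscores = [], [], []
for true_row, pred_row in zip(expect, predict):
    tp = np.sum((true_row == 1) & (pred_row == 1))  # true positives for this sample
    p = tp / np.sum(pred_row == 1)                  # precision for this sample
    r = tp / np.sum(true_row == 1)                  # recall for this sample
    precisions.append(p)
    recalls.append(r)
    fscores.append(2 * p * r / (p + r))             # harmonic mean per sample

print(np.mean(precisions), np.mean(recalls), np.mean(fscores))
# ≈ 0.5833 0.6111 0.4889
```

The three printed means match the tuple returned by precision_recall_fscore_support(expect, predict, average='samples').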
It is helpful to calculate the precision, recall, and F-measure for a single sample. To calculate the P, R, and F values for the third item in your list, you can run:
import numpy as np
from sklearn import metrics

predict = np.array([[1, 1, 1, 1]])
expect = np.array([[1, 0, 0, 0]])
smp_report = metrics.precision_recall_fscore_support(expect, predict, beta=1, average='samples')
f_report = metrics.f1_score(expect, predict, average='samples')
print(f_report, smp_report)
Running this code gives you 0.4 (0.25, 1.0, 0.40000000000000002, None). The values inside the tuple are the precision, recall, and F-measure for that sample (in that order; the trailing None is the support, which is undefined for averaged results). As you can see, the F-measure is the harmonic mean of precision and recall:
2 * [(0.25 * 1.0) / (0.25 + 1.0)] = 0.4
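The same check in plain Python, with the precision and recall values taken from the output above:

```python
p, r = 0.25, 1.0         # precision and recall for the third sample
f = 2 * p * r / (p + r)  # harmonic mean of the two
print(f)                 # 0.4
```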
By substituting your first two rows into the code above, you can calculate the precision, recall, and harmonic F-measure for each of the three samples in your data set:
first item values: 0.666666666667 (1.0, 0.5, 0.66666666666666663, None)
second item values: 0.4 (0.5, 0.33333333333333331, 0.40000000000000002, None)
third item values: 0.4 (0.25, 1.0, 0.40000000000000002, None)
Scikit-learn then calculates the mean of these precision values, i.e. (1 + 0.5 + 0.25) / 3 = 0.5833333333333333, the mean of these recall values, (0.5 + 0.333 + 1) / 3 = 0.61111111111111105, and the mean of these F-measures, (0.666 + 0.4 + 0.4) / 3 = 0.48888888888888893, and returns those mean values. These are the values you report above. Scikit-learn is calculating the harmonic mean for each classification event, then simply returning the mean of those harmonic means.
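The distinction can be verified numerically. Using the per-sample precision and recall values from the three items above, this short sketch shows that the mean of the per-sample harmonic means is not the same quantity as the harmonic mean of the averaged precision and recall:

```python
import numpy as np

# Per-sample precision and recall from the three items above
precisions = np.array([1.0, 0.5, 0.25])
recalls = np.array([0.5, 1 / 3, 1.0])
fscores = 2 * precisions * recalls / (precisions + recalls)

mean_of_harmonics = fscores.mean()  # what f1_score(average='samples') returns
harmonic_of_means = (2 * precisions.mean() * recalls.mean()
                     / (precisions.mean() + recalls.mean()))

print(mean_of_harmonics)  # ≈ 0.4889
print(harmonic_of_means)  # ≈ 0.5969, a different (larger) number
```

This is exactly why your 2*smp_report[0]*smp_report[1]/(smp_report[0]+smp_report[1]) formula does not reproduce f_report.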