Area under the ROC curve using Sklearn?

I can't figure out why the Sklearn function roc_auc_score returns 1 in the following case:

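from sklearn.metrics import roc_auc_score

# Ground-truth binary labels (1 = positive class)
y_true = [0, 0, 1, 0, 0, 0, 0, 1]

# Predicted scores for the positive class, one per label
y_scores = [0.18101096153259277, 0.15506085753440857,
            0.9940806031227112, 0.05024950951337814,
            0.7381414771080017, 0.8922111988067627,
            0.8253260850906372, 0.9967281818389893]

roc_auc_score(y_true, y_scores)  # returns 1.0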

The three scores 0.7381414771080017, 0.8922111988067627, 0.8253260850906372 near the end are high, yet their labels are 0, 0, 0. So how can the AUC be 1? What am I getting wrong here?

The AUC of the ROC curve only measures your model's ability to rank the data points with respect to the positive class. It does not depend on the absolute values of the scores.

In your example, every positive-class data point scores higher than every negative-class data point. Hence, a roc_auc_score of 1 is correct.

import pandas as pd

# Sort by score, highest first, to see the ranking the AUC evaluates
pd.DataFrame({'y_true':y_true,'y_scores':y_scores}).sort_values('y_scores',ascending=False)

    y_scores    y_true
7   0.996728    1
2   0.994081    1
5   0.892211    0
6   0.825326    0
4   0.738141    0
0   0.181011    0
1   0.155061    0
3   0.050250    0
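
The ranking argument can be checked directly: the ROC AUC equals the fraction of (positive, negative) pairs in which the positive point gets the higher score. Below is a minimal pure-Python sketch of that check. The helper name `pairwise_auc` is made up for illustration and is not part of Sklearn.

```python
from itertools import product

y_true = [0, 0, 1, 0, 0, 0, 0, 1]
y_scores = [0.18101096153259277, 0.15506085753440857,
            0.9940806031227112, 0.05024950951337814,
            0.7381414771080017, 0.8922111988067627,
            0.8253260850906372, 0.9967281818389893]


def pairwise_auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher.

    Hypothetical helper for illustration; ties count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


auc = pairwise_auc(y_true, y_scores)
print(auc)  # 1.0: all 2 * 6 = 12 pairs are ordered correctly
```

The negatives scoring 0.74 to 0.89 do not lower the result, because both positives (0.994 and 0.997) still rank above them.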

If you look at the ROC itself, it's easier to see why:

> roc_curve(y_true, y_scores)

(array([0., 0., 0., 1.]),
 array([0. , 0.5, 1. , 1. ]),
 array([1.99672818, 0.99672818, 0.9940806 , 0.05024951]))

The first array in the returned tuple holds the FPR values, the second holds the TPR values, and the third holds the thresholds at which those values change.
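
To see where those numbers come from, the rates can be recomputed by hand for each threshold that is an actual score. This is only a sketch: `rates_at` is a hypothetical helper, and it assumes a point is predicted positive when its score is greater than or equal to the threshold. The extra leading threshold in the output above lies above every score, so nothing is predicted positive there and it gives the (0, 0) starting point.

```python
y_true = [0, 0, 1, 0, 0, 0, 0, 1]
y_scores = [0.18101096153259277, 0.15506085753440857,
            0.9940806031227112, 0.05024950951337814,
            0.7381414771080017, 0.8922111988067627,
            0.8253260850906372, 0.9967281818389893]


def rates_at(threshold, labels, scores):
    """Return (FPR, TPR) when every score >= threshold is predicted positive.

    Hypothetical helper for illustration, not a Sklearn function.
    """
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return fp / n_neg, tp / n_pos


thresholds = (0.9967281818389893, 0.9940806031227112, 0.05024950951337814)
points = [rates_at(t, y_true, y_scores) for t in thresholds]
for t, (fpr, tpr) in zip(thresholds, points):
    print(f"threshold={t:.8f}  FPR={fpr}  TPR={tpr}")
```

The three (FPR, TPR) pairs come out as (0.0, 0.5), (0.0, 1.0) and (1.0, 1.0), matching the last three entries of the arrays above.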

For a threshold of 0.99672818, the TPR is indeed 0.5, and not 1, which might lead you to think that the AUC of the ROC is not 1. However, the (FPR, TPR) points simply trace the line segments (0, 0) -> (0, 1) -> (1, 1), and the area beneath them is indeed 1.
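
The area can also be verified numerically by applying the trapezoidal rule to the FPR and TPR arrays returned by roc_curve. The sketch below hard-codes those arrays from the output above. `trapezoid_area` is a hypothetical helper written for this example.

```python
# FPR and TPR arrays copied from the roc_curve output
fpr = [0.0, 0.0, 0.0, 1.0]
tpr = [0.0, 0.5, 1.0, 1.0]


def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through the points (xs[i], ys[i])."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:]))


area = trapezoid_area(fpr, tpr)
print(area)  # 1.0
```

The two vertical segments at FPR = 0 have zero width and contribute nothing. The whole area comes from the final segment from (0, 1) to (1, 1), a unit square.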
