
How to get roc auc for binary classification in sklearn

I have a binary classification problem where I want to calculate the roc_auc of the results. For this purpose, I did it in two different ways using sklearn. My code is as follows.

Code 1:

import numpy as np
from sklearn.metrics import make_scorer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_validate

# clf, X and y are my classifier and data, defined earlier
myscore = make_scorer(roc_auc_score, needs_proba=True)

my_value = cross_validate(clf, X, y, cv=10, scoring=myscore)
print(np.mean(my_value['test_score'].tolist()))

I get the output as 0.60.

Code 2:

from sklearn.model_selection import cross_val_predict
from sklearn.metrics import roc_curve, auc

# k_fold is my cross-validation setting (same folds as above)
y_score = cross_val_predict(clf, X, y, cv=k_fold, method="predict_proba")

fpr = dict()
tpr = dict()
roc_auc = dict()
for i in range(2):
    # column i of y_score is the predicted probability of class i
    fpr[i], tpr[i], _ = roc_curve(y, y_score[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])
print(roc_auc)

I get the output as {0: 0.41, 1: 0.59}.

I am confused since I get two different scores from the two pieces of code. Please let me know why this difference happens and what the correct way of doing this is.

I am happy to provide more details if needed.

It seems that you used part of my code from another answer, so I thought I would also answer this question.

For a binary classification case, you have 2 classes and one is the positive class.

For example, see here: pos_label is the label of the positive class. When pos_label=None, if y_true is in {-1, 1} or {0, 1}, pos_label is set to 1, otherwise an error will be raised.
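As a quick, self-contained check (toy labels and scores, not your data), roc_curve with the default pos_label and roc_auc_score agree when y_true is in {0, 1} and you score the probability of the positive class:

import numpy as np
from sklearn.metrics import roc_curve, auc, roc_auc_score

y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])  # predicted probability of class 1

fpr, tpr, _ = roc_curve(y_true, y_prob)   # pos_label defaults to 1 for {0, 1} labels
print(auc(fpr, tpr))                  # 0.75
print(roc_auc_score(y_true, y_prob))  # 0.75, same value

The same idea applied to a binary subset of the iris data: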

import matplotlib.pyplot as plt
from sklearn import svm, datasets
from sklearn.metrics import roc_curve, auc
from sklearn.multiclass import OneVsRestClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.linear_model import LogisticRegression
import numpy as np

iris = datasets.load_iris()
X = iris.data
y = iris.target
mask = (y!=2)
y = y[mask]
X = X[mask,:]
print(y)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]

positive_class = 1

clf = OneVsRestClassifier(LogisticRegression())
y_score = cross_val_predict(clf, X, y, cv=10 , method='predict_proba')

fpr = dict()
tpr = dict()
roc_auc = dict()
fpr[positive_class], tpr[positive_class], _ = roc_curve(y, y_score[:, positive_class])
roc_auc[positive_class] = auc(fpr[positive_class], tpr[positive_class])
print(roc_auc)

{1: 1.0}

and

from sklearn.metrics import make_scorer
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_validate

myscore = make_scorer(roc_auc_score, needs_proba=True)

clf = OneVsRestClassifier(LogisticRegression())
my_value = cross_validate(clf, X, y, cv=10, scoring = myscore)
print(np.mean(my_value['test_score'].tolist()))
1.0
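Both approaches give 1.0 on this easily separable data. As a direct check, you can also score the pooled out-of-fold probabilities of the positive class:

# reuses y, y_score and positive_class from the cross_val_predict snippet above
print(roc_auc_score(y, y_score[:, positive_class]))
# 1.0

In your own output, the 0.41 for class 0 is just 1 - 0.59, because the two columns of predict_proba sum to one; the value that corresponds to make_scorer(roc_auc_score, needs_proba=True) is the one for the positive class (about 0.59). The small remaining gap to your 0.60 is most likely because cross_validate averages one AUC per fold, while cross_val_predict pools all out-of-fold predictions into a single ROC curve.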
