
Plotting Multiclass ROC Curve

I am new to scikit-learn and Python.
I am currently trying to generate multiclass (3 classes) ROC curves from a csv file that looks like this:

  probability,predclass,dist0,dist1,dist2,actualclass
  99.94571208953857,1,0.00022618949060415616,99.94571208953857,0.054055178770795465,1
  99.99398589134216,0,99.99398589134216,0.001082851395040052,0.004925658140564337,0
  99.97997879981995,1,0.015142260235734284,99.97997879981995,0.004879535117652267,1
  93.58544945716858,2,5.507804825901985,0.9067309089004993,93.58544945716858,2
  92.31788516044617,1,7.572370767593384,92.31788516044617,0.10974484030157328,1
  62.839555740356445,1,2.3740695789456367,62.839555740356445,34.786370396614075,2
        ... 

My current code is:

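import pandas as pd
from sklearn.metrics import roc_curve

df = pd.read_csv('mydata.csv')
# Scale the per-class percentages to the 0..1 range
pred = df.loc[:, ['dist0', 'dist1', 'dist2']] / 100
actual = df['actualclass']
fpr, tpr, _ = roc_curve(actual, pred)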

I have already tried the solution from this answer: https://stackoverflow.com/a/45335434/14482749

by doing:

for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(actual, pred[:, i]) 
    roc_auc[i] = auc(fpr[i], tpr[i])

But I get the error: TypeError: '(slice(None, None, None), 0)' is an invalid key, raised at line 2 of the snippet above.

I believe the problem is my variable 'actual', but I'm not sure what is wrong with it.

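OK, I figured it out (my way).

We need to feed dist0..2 and actualclass into the roc_curve function as 2-D arrays with one column per class, so that each call gets one column of each, like this: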

for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(actual[:,i], pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])

That way we can iterate over each of the n classes.

For actual, I did:

import numpy as np

# One-hot encode the true labels: one column per class
actual2Dict = {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1]}
actual2 = df['actualclass'].map(actual2Dict)

# Stack the per-row lists into an (n_samples, n_classes) array
actualnew = np.array(actual2.tolist())
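
The hand-written mapping above can probably be replaced by scikit-learn's label_binarize, which builds the same kind of one-column-per-class array. This is only a sketch: the labels are the actualclass values of the six sample rows shown in the question, and the name actual_onehot is made up for the illustration.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# actualclass values of the six sample rows shown in the question
actual = np.array([1, 0, 1, 2, 1, 2])

# One column per class; a 1 marks the class each row belongs to.
# `classes` fixes the column order, so column i corresponds to class i.
actual_onehot = label_binarize(actual, classes=[0, 1, 2])

print(actual_onehot.shape)
print(actual_onehot)
```

Passing classes explicitly keeps the column order stable even if some class happens to be missing from a particular file.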

As for pred, I did:

pred = pred.to_numpy()
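
This conversion matters because pred starts out as a pandas DataFrame, and pred[:, i] is NumPy-style indexing that a DataFrame does not accept; that is presumably what raised the "invalid key" error in the question. A minimal sketch of the difference, using made-up numbers rather than the real data (pred_df, col_iloc and col_numpy are names invented for this example):

```python
import numpy as np
import pandas as pd

# Illustrative values only, not taken from the csv file
pred_df = pd.DataFrame({
    "dist0": [0.90, 0.05, 0.10],
    "dist1": [0.05, 0.90, 0.20],
    "dist2": [0.05, 0.05, 0.70],
})

# NumPy-style indexing on a DataFrame fails. The exact exception type
# differs between pandas versions, so it is caught broadly here.
try:
    pred_df[:, 0]
    failed = False
except Exception:
    failed = True

# Two ways to get the first column as a 1-D array
col_iloc = pred_df.iloc[:, 0].to_numpy()   # positional indexing in pandas
col_numpy = pred_df.to_numpy()[:, 0]       # convert first, then slice

print(failed, col_iloc, col_numpy)
```

So either pred.iloc[:, i] or a one-time pred.to_numpy() should make the loop work; the answer here takes the second route.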

From there I went on to the plotting:

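from itertools import cycle

import matplotlib.pyplot as plt
from sklearn.metrics import auc, roc_curve

n_classes = 3  # classes 0, 1 and 2

# Per-class ROC curves (one-vs-rest), keyed by class index
fpr, tpr, roc_auc = dict(), dict(), dict()
for i in range(n_classes):
    fpr[i], tpr[i], _ = roc_curve(actualnew[:, i], pred[:, i])
    roc_auc[i] = auc(fpr[i], tpr[i])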

lw = 2

# Compute micro-average ROC curve and ROC area
fpr["micro"], tpr["micro"], _ = roc_curve(actualnew.ravel(), pred.ravel())
roc_auc["micro"] = auc(fpr["micro"], tpr["micro"])

# First aggregate all false positive rates
all_fpr = np.unique(np.concatenate([fpr[i] for i in range(n_classes)]))

# Then interpolate all ROC curves at these points (linear interpolation via np.interp)
mean_tpr = np.zeros_like(all_fpr)
for i in range(n_classes):
    mean_tpr += np.interp(all_fpr, fpr[i], tpr[i])

# Finally average it and compute AUC
mean_tpr /= n_classes

fpr["macro"] = all_fpr
tpr["macro"] = mean_tpr
roc_auc["macro"] = auc(fpr["macro"], tpr["macro"])
    
# Plot all ROC curves
plt.figure()
plt.plot(fpr["micro"], tpr["micro"],
         label='micro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["micro"]),
         color='deeppink', linestyle=':', linewidth=4)

plt.plot(fpr["macro"], tpr["macro"],
         label='macro-average ROC curve (area = {0:0.2f})'
               ''.format(roc_auc["macro"]),
         color='navy', linestyle=':', linewidth=4)

colors = cycle(['aqua', 'darkorange', 'cornflowerblue'])
for i, color in zip(range(n_classes), colors):
    plt.plot(fpr[i], tpr[i], color=color, lw=lw,
             label='ROC curve of class {0} (area = {1:0.2f})'
             ''.format(i, roc_auc[i]))

plt.plot([0, 1], [0, 1], 'k--', lw=lw)
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Some extension of Receiver operating characteristic to multi-class')
plt.legend(loc="lower right")
plt.show()

The code for plotting the multiclass roc_curve is excerpted from: https : roc_curve
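
If only the AUC numbers are needed and not the curves, roc_auc_score with multi_class="ovr" might be a shorter route. The sketch below is an assumption-laden illustration, not part of the original answer: it uses the six sample rows from the question rounded to 4 decimals, and it renormalizes each row because roc_auc_score expects multiclass scores to sum to 1.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# The six sample rows from the question, rounded to 4 decimals.
# Columns are dist0, dist1, dist2 (percentages).
dist = np.array([
    [0.0002, 99.9457, 0.0541],
    [99.9940, 0.0011, 0.0049],
    [0.0151, 99.9800, 0.0049],
    [5.5078, 0.9067, 93.5854],
    [7.5724, 92.3179, 0.1097],
    [2.3741, 62.8396, 34.7864],
])
actual = np.array([1, 0, 1, 2, 1, 2])

# Rescale every row so that its three scores sum to exactly 1
scores = dist / dist.sum(axis=1, keepdims=True)

# One-vs-rest AUC for each class, then the unweighted mean over classes
macro_auc = roc_auc_score(actual, scores, multi_class="ovr", average="macro")
print(macro_auc)
```

Note that this averages the per-class AUC values, which is not necessarily identical to the area under the interpolated macro-average curve computed in the plotting code.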

If you have a different approach, please share!

