scikit-learn average_precision_score: bad input shape

Question

I am trying to plot a precision/recall curve. Here is my code:

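    import matplotlib.pyplot as plt
    from sklearn import preprocessing
    from sklearn.metrics import average_precision_score, precision_recall_curve

    lbl_enc = preprocessing.LabelEncoder()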
    labels = lbl_enc.fit_transform(test_tags)

    y_score = clf.predict_proba(test_set)

    average_precision = average_precision_score(labels, y_score)
    print('Average precision-recall score: {0:0.2f}'.format(average_precision))

    precision, recall, _ = precision_recall_curve(labels, y_score)

    plt.step(recall, precision, color='b', alpha=0.2,
             where='post')
    plt.fill_between(recall, precision, step='post', alpha=0.2,
                     color='b')

    plt.xlabel('Recall')
    plt.ylabel('Precision')
    plt.ylim([0.0, 1.05])
    plt.xlim([0.0, 1.0])
    plt.title('2-class Precision-Recall curve: Average P-R = {0:0.2f}'.format(
        average_precision))

When computing average_precision_score, I get "ValueError: bad input shape (119, 2)", caused by the y_score variable.

y_score looks like this:

array([[0.45953712, 0.54046288],
   [0.78289908, 0.21710092],
   [0.13488789, 0.86511211],
   [0.56162583, 0.43837417],
   (...)
   [0.4595595 , 0.5404405 ]])

And labels looks like this:

array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
   1, 1, 1, 1, 1, 1, 1, 1, 1])

How can I make this work so that I can compute the average precision score? Thanks in advance.

Answer 1

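The documentation says:

y_score : array, shape = [n_samples] or [n_samples, n_classes]

Target scores, can either be probability estimates of the positive class, confidence values, or non-thresholded measure of decisions (as returned by "decision_function" on some classifiers).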

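Your labels are binary, so the score should be a 1-D array holding the probability of the positive class, which is the second column of the predict_proba output. So I believe you just need to do:

    average_precision = average_precision_score(labels, y_score[:, 1])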
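
For reference, here is a minimal, self-contained sketch of the same fix. The synthetic data from make_classification and the LogisticRegression classifier are assumptions standing in for the asker's test_set, test_tags and clf, which are not shown in the question. It looks up the positive-class column through clf.classes_ rather than hard-coding index 1, and passes the same 1-D scores to precision_recall_curve.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split

# Synthetic binary data (assumption: stands in for the asker's real data).
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Any classifier exposing predict_proba would do here.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba returns one column per class: shape (n_samples, 2).
proba = clf.predict_proba(X_test)

# Columns are ordered like clf.classes_, so find the column for label 1.
pos_col = list(clf.classes_).index(1)
y_score = proba[:, pos_col]  # 1-D, shape (n_samples,)

# Both functions take the 1-D positive-class scores for binary labels.
average_precision = average_precision_score(y_test, y_score)
precision, recall, _ = precision_recall_curve(y_test, y_score)

print('Average precision-recall score: {0:0.2f}'.format(average_precision))
```

With labels encoded as 0 and 1, pos_col comes out as 1, which matches the y_score[:, 1] slice above.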

Solution 1
2 Accepted 2018-04-12 14:11:37
