缺少准確度數據的分類報告輸出

Question

I am taking a course and working through some of its examples, and my output looks wrong.

import pandas as pd 

df = pd.read_csv(r'E:\Python Projects\Python-Data-Science-and-Machine-Learning-Bootcamp\Machine Learning\Árvores de decisão e Florestas Aleatórias\kyphosis.csv')

from sklearn.model_selection import train_test_split

x = df.drop('Kyphosis', axis=1)
y = df['Kyphosis']

X_train, X_test, y_train, y_test = train_test_split(x,y,test_size=0.33)

from sklearn.tree import DecisionTreeClassifier

dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
pred = dtree.predict(X_test)

from sklearn.metrics import classification_report

print(classification_report(y_test, pred))

These 2 values are missing from the output.

Answer 1

This is simply how classification_report lays out its text summary; nothing is missing.

See the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html

>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support
<BLANKLINE>
     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3
<BLANKLINE>
    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
<BLANKLINE>
>>> y_pred = [1, 1, 0]
>>> y_true = [1, 1, 1]
>>> print(classification_report(y_true, y_pred, labels=[1, 2, 3]))
              precision    recall  f1-score   support
<BLANKLINE>
           1       1.00      0.67      0.80         3
           2       0.00      0.00      0.00         0
           3       0.00      0.00      0.00         0
<BLANKLINE>
   micro avg       1.00      0.67      0.80         3
   macro avg       0.33      0.22      0.27         3
weighted avg       1.00      0.67      0.80         3
<BLANKLINE>

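The reported averages include macro average (averaging the unweighted mean per label), weighted average (averaging the support-weighted mean per label), and sample average (only for multilabel classification). Micro average (averaging the total true positives, false negatives and false positives) is only shown for multi-label or multi-class with a subset of classes, because it corresponds to accuracy otherwise.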
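To make the difference between these averages concrete, here is a minimal sketch in plain Python (no scikit-learn needed) that recomputes the precision column for the toy data from the documentation example. The helper `safe_div` and the variable names are my own and are not part of scikit-learn; treating 0/0 as 0.0 is an assumption made for this illustration.

```python
# Toy data from the scikit-learn documentation example.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
labels = sorted(set(y_true) | set(y_pred))
pairs = list(zip(y_true, y_pred))


def safe_div(num, den):
    # Hypothetical helper: treat 0/0 as 0.0 (an assumption for this sketch).
    return num / den if den else 0.0


# Per-class true positives, false positives and support (true count).
tp = {c: sum(t == c and p == c for t, p in pairs) for c in labels}
fp = {c: sum(t != c and p == c for t, p in pairs) for c in labels}
support = {c: sum(t == c for t in y_true) for c in labels}

precision = {c: safe_div(tp[c], tp[c] + fp[c]) for c in labels}

n = len(y_true)
# Macro: plain mean over classes, every class counts equally.
macro_precision = sum(precision.values()) / len(labels)
# Weighted: mean over classes, weighted by each class's support.
weighted_precision = sum(precision[c] * support[c] for c in labels) / n
# Micro: pool the counts over all classes first, then divide once.
micro_precision = safe_div(sum(tp.values()), sum(tp.values()) + sum(fp.values()))
# Accuracy: fraction of predictions that match the true label.
accuracy = sum(t == p for t, p in pairs) / n

print(f"macro    {macro_precision:.2f}")     # macro    0.50
print(f"weighted {weighted_precision:.2f}")  # weighted 0.70
print(f"micro    {micro_precision:.2f}")     # micro    0.60
print(f"accuracy {accuracy:.2f}")            # accuracy 0.60
```

With one predicted label per sample and all classes included, every wrong prediction is a false positive for exactly one class, so the pooled (micro) precision equals the accuracy. That is why the report prints a single accuracy row instead of a micro avg row in this case.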

Your accuracy is simply 74%.

Answer 2

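Your classification report is not missing anything; it is a quirk of scikit-learn that it chooses to display the accuracy there, but there is no such thing as "precision accuracy" or "recall accuracy". Your actual accuracy is the value in the accuracy row, shown under the f1-score column; here is an example with the toy data from the documentation: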

from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))

Result:

              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5

That is, the accuracy here is 0.6, which you can verify directly:

from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)
# 0.6
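If the text layout is the source of confusion, one option is to ask for the report as a dictionary instead. The sketch below assumes a scikit-learn version recent enough to support the `output_dict` parameter of `classification_report`; the variable name `report` is just for illustration.

```python
from sklearn.metrics import accuracy_score, classification_report

# Toy data from the scikit-learn documentation example.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]

# output_dict=True returns a nested dict instead of the formatted text.
report = classification_report(y_true, y_pred, output_dict=True)

# The "accuracy" entry is a single number, not a precision/recall/f1 triple,
# which is why the text report leaves the first two columns of that row empty.
print(f"accuracy from report: {report['accuracy']:.2f}")
print(f"accuracy_score:       {accuracy_score(y_true, y_pred):.2f}")

# The averaged rows are dicts with one entry per column.
print(sorted(report["macro avg"]))
```

Here the two accuracy lines should agree, and the macro avg entry carries the four column names (f1-score, precision, recall, support) as keys.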

You are right, though, that this is odd and can certainly be confusing. Not a great design choice...

Answer 1
1 (accepted) 2020-03-26 21:52:34

Answer 2
1 2020-03-26 22:01:21
