classification_report output with missing accuracy data
I'm taking a course and working through some examples, but my output comes out wrong.
import pandas as pd
df = pd.read_csv(r'E:\Python Projects\Python-Data-Science-and-Machine-Learning-Bootcamp\Machine Learning\Árvores de decisão e Florestas Aleatórias\kyphosis.csv')
from sklearn.model_selection import train_test_split
x = df.drop('Kyphosis', axis=1)
y = df['Kyphosis']
X_train, X_test, y_train, y_test = train_test_split(x,y,test_size=0.33)
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
pred = dtree.predict(X_test)
from sklearn.metrics import classification_report
print(classification_report(y_test, pred))
This is how classification_report returns the text summary; nothing is missing.
Look into the documentation: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.classification_report.html
>>> from sklearn.metrics import classification_report
>>> y_true = [0, 1, 2, 2, 2]
>>> y_pred = [0, 0, 2, 2, 1]
>>> target_names = ['class 0', 'class 1', 'class 2']
>>> print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support
<BLANKLINE>
     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3
<BLANKLINE>
    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
<BLANKLINE>
>>> y_pred = [1, 1, 0]
>>> y_true = [1, 1, 1]
>>> print(classification_report(y_true, y_pred, labels=[1, 2, 3]))
              precision    recall  f1-score   support
<BLANKLINE>
           1       1.00      0.67      0.80         3
           2       0.00      0.00      0.00         0
           3       0.00      0.00      0.00         0
<BLANKLINE>
   micro avg       1.00      0.67      0.80         3
   macro avg       0.33      0.22      0.27         3
weighted avg       1.00      0.67      0.80         3
<BLANKLINE>
The reported averages include macro average (averaging the unweighted mean per label), weighted average (averaging the support-weighted mean per label), and sample average (only for multilabel classification). Micro average (averaging the total true positives, false negatives and false positives) is only shown for multi-label or multi-class with a subset of classes, because it corresponds to accuracy otherwise.
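To see why the micro average "corresponds to accuracy" in the plain multi-class case, here is a minimal pure-Python sketch using the toy labels from the documentation example. The helper name micro_counts is made up for illustration and is not part of scikit-learn.

```python
# Toy labels from the documentation example.
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]


def micro_counts(y_true, y_pred):
    """Hypothetical helper: sum TP, FP and FN over all classes."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1  # correctly predicted as this class
            elif p == label and t != label:
                fp += 1  # predicted as this class, but it is another one
            elif p != label and t == label:
                fn += 1  # belongs to this class, but predicted as another
    return tp, fp, fn


tp, fp, fn = micro_counts(y_true, y_pred)
micro_precision = tp / (tp + fp)
micro_recall = tp / (tp + fn)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# All three values are 0.6 for this data.
print(micro_precision, micro_recall, accuracy)
```

Every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), so FP and FN are both equal to the number of errors, and micro precision and micro recall both reduce to correct / total.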
Your accuracy is simply 74%.
Your classification report is not missing anything; it is a peculiarity of scikit-learn that it chooses to display the accuracy there, but there is no "precision accuracy" or "recall accuracy". Your actual accuracy is what is shown under the f1-score column; here is an example with toy data from the documentation:
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))
Result:
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
i.e. the accuracy here is 0.6, something that you can directly verify:
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)
# 0.6
You are right that it's odd, though, and it can certainly be confusing. Not a great design choice...
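If the text layout is the confusing part, one way around it is to ask for the report as a dictionary. A small sketch, assuming a scikit-learn version that supports the output_dict parameter of classification_report, and reusing the toy data from the documentation example:

```python
from sklearn.metrics import accuracy_score, classification_report

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

# output_dict=True returns a nested dict instead of the formatted string.
report = classification_report(
    y_true, y_pred, target_names=target_names, output_dict=True
)

# "accuracy" maps to a single float, unlike the per-class and average rows,
# which are dicts with precision, recall, f1-score and support.
print(report["accuracy"])
print(accuracy_score(y_true, y_pred))
print(report["macro avg"])
```

In the dictionary form the accuracy is a single number with no column attached, which makes it clear that it is not tied to precision, recall or f1-score.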