Sklearn.metrics.classification_report Confusion Matrix Problem?
Firstly, thank you for reading my question - I hope this is the right place for this.
I am coding up sensitivity, specificity and precision calculations from a confusion matrix from scratch. I have the following confusion matrix for 4 classes.
                    True Class
                1     2     3     4
          1 [[  0     1     3     0]
Predicted 2  [  0   181    23     0]
Class     3  [  0    17    53    14]
          4  [  0     3    22    77]]
When I use sklearn.metrics.classification_report, this is what I get:
precision    recall  f1-score   support
     0.00      0.00      0.00         4
     0.89      0.89      0.89       204
     0.52      0.63      0.57        84
     0.85      0.75      0.80       102
However, when I calculate precision and recall myself, I get the following (i.e. the values for precision and recall are flipped):
precision    recall
    0.0         nan
    0.887     0.896
    0.631     0.524
    0.755     0.846
For each class I calculate the following true positives, false positives, true negatives and false negatives:
class    Tp    Fp    Tn    Fn
1         0     4   390     0
2       181    23   169    21
3        53    31   262    48
4        77    25   278    14
The formulas that I'm using (https://en.wikipedia.org/wiki/Confusion_matrix) are:
sensitivity/recall = true_positives / (true_positives + false_negatives)
precision = true_positives / (true_positives + false_positives)
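As a sanity check on these formulas, here is a minimal NumPy sketch (the array names are only illustrative) that plugs in the per-class counts from the table above. It gives approximately the hand-calculated values, including nan for the recall of class 1, where the denominator is 0.

```python
import numpy as np

# Per-class counts copied from the table above (classes 1 to 4).
tp = np.array([0, 181, 53, 77], dtype=float)
fp = np.array([4, 23, 31, 25], dtype=float)
fn = np.array([0, 21, 48, 14], dtype=float)

# 0/0 yields nan; silence NumPy's warning for that case.
with np.errstate(divide="ignore", invalid="ignore"):
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)

for label, p, r in zip([1, 2, 3, 4], precision, recall):
    print(label, "precision", round(p, 3), "recall", round(r, 3))
```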
Where am I going wrong? Surely sklearn's classification report can't be the problem. Am I misreading something?
Edit: here is my function for calculating the precision and recall values, given a confusion matrix from sklearn.metrics.confusion_matrix and a list of class labels (for example, [1, 2, 3] for classes 1-3).
def calc_precision_recall(conf_matrix, class_labels):
    # for each class
    for i in range(len(class_labels)):
        # calculate true positives
        true_positives = conf_matrix[i, i]
        # false positives
        false_positives = conf_matrix[i, :].sum() - true_positives
        # false negatives
        false_negatives = 0
        for j in range(len(class_labels)):
            false_negatives += conf_matrix[j, i]
        false_negatives -= true_positives
        # and finally true negatives
        true_negatives = (conf_matrix.sum() - false_positives - false_negatives - true_positives)
        # print calculated values
        print(
            "Class label", class_labels[i],
            "T_positive", true_positives,
            "F_positive", false_positives,
            "T_negative", true_negatives,
            "F_negative", false_negatives,
            "\nSensitivity/recall", true_positives / (true_positives + false_negatives),
            "Specificity", true_negatives / (true_negatives + false_positives),
            "Precision", true_positives / (true_positives + false_positives), "\n"
        )
    return
Ok, where is your code? It's impossible to say for sure when no one can see your code. I'll just take a stab here... maybe your data is imbalanced. Do you have more/fewer records in some feature columns? Resample arrays or sparse matrices in a consistent way.
This should run fine for you, right? Test it and see.
# Begin by importing all necessary libraries
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn import datasets
# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, 0:3]  # we only take the first three features.
y = iris.target
# Now that we have the features and labels we want, we can split the data into training and testing sets using sklearn's handy feature train_test_split():
# Test size specifies how much of the data you want to set aside for the testing set.
# Random_state parameter is just a random seed we can use.
# You can use it if you'd like to reproduce these specific results.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=27)
# You may want to print the results to be sure your data is being parsed as you expect:
print(X_train)
print(y_train)
# Now we can instantiate the models. Let's try using two classifiers, a Support Vector Classifier and a K-Nearest Neighbors Classifier:
SVC_model = SVC()
# KNN model requires you to specify n_neighbors,
# the number of points the classifier will look at to determine what class a new point belongs to
KNN_model = KNeighborsClassifier(n_neighbors=5)
# Now let's fit the classifiers:
SVC_model.fit(X_train, y_train)
KNN_model.fit(X_train, y_train)
# The call has trained the model, so now we can predict and store the prediction in a variable:
SVC_prediction = SVC_model.predict(X_test)
KNN_prediction = KNN_model.predict(X_test)
# We should now evaluate how the classifiers performed. There are multiple methods of evaluating a classifier's performance, and you can read more about these different methods below.
# In scikit-learn you pass in the ground-truth labels (stored in your test labels) first, followed by the predictions:
# Accuracy score is the simplest way to evaluate
print(accuracy_score(y_test, SVC_prediction))
print(accuracy_score(y_test, KNN_prediction))
# But Confusion Matrix and Classification Report give more details about performance
print(confusion_matrix(y_test, SVC_prediction))
print(classification_report(y_test, KNN_prediction))
Result:
              precision    recall  f1-score   support
           0       1.00      1.00      1.00         7
           1       0.91      0.91      0.91        11
           2       0.92      0.92      0.92        12
    accuracy                           0.93        30
   macro avg       0.94      0.94      0.94        30
weighted avg       0.93      0.93      0.93        30
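One thing worth checking with calls like these: sklearn's metric functions take the ground-truth labels first and the predictions second. A minimal sketch with made-up labels (y_true and y_pred here are purely illustrative, not the iris data) suggests what happens when the two are reversed: the confusion matrix is transposed, and per-class precision and recall trade places.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Made-up labels, purely for illustration.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 1, 1, 1, 2, 2]

cm = confusion_matrix(y_true, y_pred)          # rows: true, columns: predicted
cm_swapped = confusion_matrix(y_pred, y_true)  # arguments reversed

# Reversing the arguments transposes the matrix...
print((cm_swapped == cm.T).all())

# ...and turns per-class precision into recall (and vice versa).
prec = precision_score(y_true, y_pred, average=None)
rec_swapped = recall_score(y_pred, y_true, average=None)
print(prec, rec_swapped)
```

When precision and recall happen to be equal for every class, the swap does not show up in the report, so it is easy to miss.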
See the resources below.
https://www.kaggle.com/rafjaa/resampling-strategies-for-imbalanced-datasets
https://scikit-learn.org/stable/modules/generated/sklearn.utils.resample.html
Oh, and the X and y variables both have 150 records.
X.shape
y.shape
Result:
X.shape
Out[107]: (150, 3)
y.shape
Out[108]: (150,)
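The resampling suggestion above could look roughly like this with sklearn.utils.resample. This is only a sketch on made-up data (X_demo, y_demo and the class sizes are assumptions for illustration): it upsamples the minority class with replacement so that both classes end up the same size.

```python
import numpy as np
from sklearn.utils import resample

# Made-up imbalanced data: 8 samples of class 0, 2 samples of class 1.
X_demo = np.arange(20).reshape(10, 2)
y_demo = np.array([0] * 8 + [1] * 2)

X_majority, y_majority = X_demo[y_demo == 0], y_demo[y_demo == 0]
X_minority, y_minority = X_demo[y_demo == 1], y_demo[y_demo == 1]

# Draw minority samples with replacement until both classes have 8 rows.
# Passing features and labels together keeps them aligned.
X_up, y_up = resample(
    X_minority, y_minority,
    replace=True,
    n_samples=len(y_majority),
    random_state=27,
)

X_balanced = np.vstack([X_majority, X_up])
y_balanced = np.concatenate([y_majority, y_up])
print(np.bincount(y_balanced))
```

Resampling changes the class balance the model sees; it does not change how precision and recall are computed from a given confusion matrix.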
I compared the results returned by the command with those calculated by hand, and they agree. I imagine you are wrongly assigning the values (or some of the values) of TP, FN, FP, TN. It may help to look at a graph:
(Image taken from the internet: https://www.stardat.net/post/confusion-matrix)
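One possible explanation, sketched here under the assumption that the matrix came straight from sklearn.metrics.confusion_matrix (as the question's edit states): sklearn defines entry C[i, j] as the number of samples known to be in class i and predicted to be in class j, so rows are true classes and columns are predicted classes. Reading the question's matrix that way, the row sums match the support column of the report, and the per-class values line up with what classification_report printed. The variable names are illustrative.

```python
import numpy as np

# The confusion matrix from the question.
cm = np.array([
    [0,   1,  3,  0],
    [0, 181, 23,  0],
    [0,  17, 53, 14],
    [0,   3, 22, 77],
])

tp = np.diag(cm)
support = cm.sum(axis=1)    # row sums: samples per true class
predicted = cm.sum(axis=0)  # column sums: samples per predicted class

fn = support - tp           # true class i, predicted as something else
fp = predicted - tp         # predicted class i, actually something else
tn = cm.sum() - tp - fp - fn

with np.errstate(divide="ignore", invalid="ignore"):
    precision = tp / predicted  # nan for class 1, which is never predicted
    recall = tp / support
    specificity = tn / (tn + fp)

print(support)
print(np.round(precision, 2))
print(np.round(recall, 2))
print(np.round(specificity, 2))
```

Class 1 is never predicted, so its precision is 0/0 here; classification_report shows 0.00 for that case by default.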