
Why does sklearn return the same value for accuracy and weighted-average recall in binary classification?

My problem is binary classification, where I use the following code to get the accuracy and weighted-average recall.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

clf = RandomForestClassifier(random_state=0, class_weight="balanced")
cross_validate(clf, X, y, cv=10,
               scoring=('accuracy', 'precision_weighted', 'recall_weighted', 'f1_weighted'))

I noticed that the values of accuracy and weighted-average recall are equal. However, as I understand it, these two metrics capture different aspects of performance, so I am not clear why they are exactly equal.
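
For reference, the same equality shows up outside of cross_validate as well. Below is a minimal sketch with made-up labels and predictions (the values are purely illustrative, not from my data):

from sklearn.metrics import accuracy_score, recall_score

# made-up labels and predictions, purely for illustration
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0, 1, 1]

print(accuracy_score(y_true, y_pred))                    # 0.625
print(recall_score(y_true, y_pred, average="weighted"))  # 0.625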

I found a post with a similar question: https://www.researchgate.net/post/Multiclass_classification_micro_weighted_recall_equals_accuracy . However, I did not find the answers there useful.

I am happy to provide more details if needed.

Accuracy is:

(TP + TN) / (P + N)

So let's assume you have 50 positive samples and 50 negative samples, and the model predicts 25 of the positive samples correctly and 25 of the negative samples correctly. Then:

(25 + 25) / (50 + 50) = 0.5

Weighted-average recall: first, the recall of the positive class: TP / P = 25/50 = 0.5 (and likewise 25/50 = 0.5 for the negative class).

Weighted recall:

(recall_positive * number_positive + recall_negative * number_negative) / (number_positive + number_negative) = (0.5 * 50 + 0.5 * 50) / (50 + 50) = 50/100 = 0.5
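
If you want to check the arithmetic, here is a minimal sketch that just plugs the numbers from the hypothetical 50/50 example above into these formulas:

# numbers taken from the hypothetical 50/50 example above
n_pos, n_neg = 50, 50   # positive / negative samples
tp, tn = 25, 25         # correctly predicted positives / negatives

accuracy = (tp + tn) / (n_pos + n_neg)
recall_pos = tp / n_pos
recall_neg = tn / n_neg
weighted_recall = (recall_pos * n_pos + recall_neg * n_neg) / (n_pos + n_neg)

print(accuracy, weighted_recall)  # 0.5 0.5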

In fact, this is not a coincidence: weighting each class's recall by its support cancels the per-class denominators, so weighted-average recall always reduces to (number of correct predictions) / (total number of samples), which is exactly accuracy. I hope this helps to explain why the two values match!
