
Why does binary accuracy give high accuracy while categorical accuracy gives low accuracy, in a multi-class classification problem?

I'm working on a multiclass classification problem using Keras, and I'm using binary accuracy and categorical accuracy as metrics. When I evaluate my model I get a really high value for the binary accuracy and quite a low one for the categorical accuracy. I tried to recreate the binary accuracy metric in my own code, but I am not having much luck. My understanding is that this is the process I need to recreate:

def binary_accuracy(y_true, y_pred):
    # round each output at 0.5 and compare element-wise with the target
    return K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
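
For comparison, categorical accuracy only asks whether the single most probable class matches the true class. A minimal sketch of that check (essentially what Keras's categorical_accuracy metric computes, using the same backend K and assuming one-hot targets):

def categorical_accuracy(y_true, y_pred):
    # correct only when the argmax of the prediction equals the argmax of the one-hot target
    return K.cast(K.equal(K.argmax(y_true, axis=-1),
                          K.argmax(y_pred, axis=-1)),
                  K.floatx())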

Here is my code:

from keras import backend as K
import numpy as np

preds = model.predict(X_test, batch_size=128)
print(preds)

# assumed rounding step: roundpreds was not defined in the original snippet
roundpreds = np.round(preds)

pos = 0.00
neg = 0.00

for i, val in enumerate(roundpreds):
    if val.tolist() == y_test[i]:
        pos += 1.0
    else:
        neg += 1.0

print(pos / (pos + neg))

But this gives a much lower value than the one reported by binary accuracy. Is binary accuracy even an appropriate metric to use in a multi-class problem? If so, does anyone know where I am going wrong?

So you need to understand what happens when you apply binary_crossentropy's notion of accuracy (binary accuracy) to a multi-class prediction.

  1. Let's assume that your output from softmax is (0.1, 0.2, 0.3, 0.4) and the one-hot encoded ground truth is (1, 0, 0, 0).
  2. Binary accuracy rounds every output at the 0.5 threshold; since none of these outputs is higher than 0.5, the output of your network is turned into the (0, 0, 0, 0) vector.
  3. (0, 0, 0, 0) matches the ground truth (1, 0, 0, 0) on 3 out of 4 indexes - this makes the resulting accuracy 75% for a completely wrong answer (see the sketch after this list)!
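
Here is a minimal NumPy sketch of that arithmetic on the example above (a single sample, values taken from the list):

import numpy as np

y_pred = np.array([0.1, 0.2, 0.3, 0.4])  # softmax output from the example
y_true = np.array([1, 0, 0, 0])          # one-hot ground truth

# binary accuracy: round every output at 0.5, then compare element-wise
binary_acc = np.mean(np.round(y_pred) == y_true)   # 3 of 4 positions match

# categorical accuracy: compare only the argmax positions
categorical_acc = float(np.argmax(y_pred) == np.argmax(y_true))   # 3 != 0

print(binary_acc)        # 0.75
print(categorical_acc)   # 0.0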

To solve this you could use a single-class accuracy, e.g. like this one:

from keras import backend as K

def single_class_accuracy(interesting_class_id):
    def fn(y_true, y_pred):
        class_id_true = K.argmax(y_true, axis=-1)    # one-hot targets -> class ids
        class_id_preds = K.argmax(y_pred, axis=-1)   # softmax outputs -> class ids
        # Replace class_id_preds with class_id_true for recall here
        positive_mask = K.cast(K.equal(class_id_preds, interesting_class_id), 'int32')
        true_mask = K.cast(K.equal(class_id_true, interesting_class_id), 'int32')
        # a sample counts as correct when the "is it this class?" decision matches the truth
        acc_mask = K.cast(K.equal(positive_mask, true_mask), 'float32')
        class_acc = K.mean(acc_mask)
        return class_acc

    return fn
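
Assuming one-hot labels and a softmax output layer, such a metric would be passed per class when compiling the model; the class index 0 below is just an illustrative choice:

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy', single_class_accuracy(0)])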
