二元分类器只进行真阴性和假阳性预测

Question

I'm building a neural network to classify doublets of 100*80 images into two classes.我正在构建一个神经网络，将 100*80 图像的双峰分类为两类。 My accuracy is capped at around 88% no matter what I try to do (add convolutional layers, dropouts...).无论我尝试做什么（添加卷积层、辍学......），我的准确率都被限制在 88% 左右。 I've investigated the issue and found from the confusion matrix that my model is only making true negative and false positive predictions.我已经调查了这个问题，并从混淆矩阵中发现我的 model 只做出了真阴性和假阳性预测。 I have no idea how this is possible and was wondering if anyone could help me.我不知道这是怎么可能的，我想知道是否有人可以帮助我。 Here is some of the code (I've used a really simple model architecture here):这是一些代码（我在这里使用了一个非常简单的 model 架构）：

X_train, X_test, y_train, y_test = train_test_split(X,y, test_size = 0.2, shuffle = True)

model = keras.models.Sequential()
model.add(keras.layers.Flatten(input_shape = (100,80,2)))
model.add(keras.layers.Dense(5, activation = 'relu'))
model.add(keras.layers.Dense(1, activation = 'sigmoid'))
model.compile(optimizer='adam',
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
              metrics=['accuracy'])

model.fit(X_train, y_train, epochs =10, batch_size= 200, validation_data = (X_test, y_test))

Output for training: Output 用于训练：

Epoch 1/10
167/167 [==============================] - 6s 31ms/step - loss: 0.6633 - accuracy: 0.8707 - val_loss: 0.6345 - val_accuracy: 0.8813
Epoch 2/10
167/167 [==============================] - 2s 13ms/step - loss: 0.6087 - accuracy: 0.8827 - val_loss: 0.5848 - val_accuracy: 0.8813
Epoch 3/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5630 - accuracy: 0.8828 - val_loss: 0.5435 - val_accuracy: 0.8813
Epoch 4/10
167/167 [==============================] - 2s 13ms/step - loss: 0.5249 - accuracy: 0.8828 - val_loss: 0.5090 - val_accuracy: 0.8813
Epoch 5/10
167/167 [==============================] - 2s 12ms/step - loss: 0.4931 - accuracy: 0.8828 - val_loss: 0.4805 - val_accuracy: 0.8813
Epoch 6/10
167/167 [==============================] - 2s 13ms/step - loss: 0.4663 - accuracy: 0.8828 - val_loss: 0.4567 - val_accuracy: 0.8813
Epoch 7/10
167/167 [==============================] - 2s 14ms/step - loss: 0.4424 - accuracy: 0.8832 - val_loss: 0.4363 - val_accuracy: 0.8813
Epoch 8/10
167/167 [==============================] - 3s 17ms/step - loss: 0.4198 - accuracy: 0.8848 - val_loss: 0.4190 - val_accuracy: 0.8816
Epoch 9/10
167/167 [==============================] - 2s 15ms/step - loss: 0.3982 - accuracy: 0.8887 - val_loss: 0.4040 - val_accuracy: 0.8816
Epoch 10/10
167/167 [==============================] - 3s 15ms/step - loss: 0.3784 - accuracy: 0.8942 - val_loss: 0.3911 - val_accuracy: 0.8821
Out[85]:
<keras.callbacks.History at 0x7fe3ce8dedd0>

loss, accuracies = model1.evaluate(X_test, y_test)
261/261 [==============================] - 1s 2ms/step - loss: 0.3263 - accuracy: 0.8813

y_pred = model1.predict(X_test)
y_pred = (y_pred > 0.5)
confusion_matrix((y_test > 0.5), y_pred )

array([[   0,  990],
       [   0, 7353]])

Answer 1

First, check how imbalance is your data.首先，检查您的数据有多不平衡。 If for example your dataset contain 10 samples, which 9 is class A and 1 is of class B .例如，如果您的数据集包含 10 个样本，其中 9 个是 class A和 1 个是 class B 。 So your model likely would want to maximize its acciracy by simply always tell you the class is A - it would still get 90% accuracy.因此，您的 model 可能希望通过简单地总是告诉您 class 是A来最大化其准确性 - 它仍然可以获得 90% 的准确度。 When you actually wish to punish him alot on the unreprented class - ie class B .当你真的想在毫无掩饰的 class 上惩罚他时 - 即 class B 。

So if indeed your data is inbalanced you can change try to change the metric from [accuracy] to ['matthews_correlation'] eg因此，如果您的数据确实不平衡，您可以尝试将指标从[accuracy]更改为['matthews_correlation']例如

model.compile(optimizer='adam',
          loss=tf.keras.losses.BinaryCrossentropy(from_logits=False),
          metrics=['matthews_correlation'])

Which will do what I have explained in the beginning,over punish the mistakes in the unrepresented class.这将按照我在开头解释的那样，过度惩罚未表示的 class 中的错误。

二元分类器只进行真阴性和假阳性预测

问题描述

1 个解决方案

解决方案1
1 2022-01-02 13:42:41

二元分类器只进行真阴性和假阳性预测

问题描述

1 个解决方案

解决方案1 1 2022-01-02 13:42:41

解决方案1
1 2022-01-02 13:42:41