
Train accuracy decreases with train loss

I wrote this very simple code:

model = keras.models.Sequential()
# First layer: 13000 random ReLU features, frozen (trainable=False).
model.add(layers.Dense(13000, input_dim=X_train.shape[1], activation='relu', trainable=False))
# Output layer: a single linear unit.
model.add(layers.Dense(1, input_dim=13000, activation='linear'))
model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

# Full-batch training (batch_size equals the number of training samples).
model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1)

The data is MNIST, restricted to the digits "0" and "1" only. I have a very strange problem: the loss decreases monotonically to zero, as expected, but the accuracy, instead of increasing, decreases as well.

Here is a sample output:

12665/12665 [==============================] - 0s 11us/step - loss: 0.0107 - accuracy: 0.2355
Epoch 181/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0114 - accuracy: 0.2568
Epoch 182/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0128 - accuracy: 0.2726
Epoch 183/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0133 - accuracy: 0.2839
Epoch 184/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0134 - accuracy: 0.2887
Epoch 185/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0110 - accuracy: 0.2842
Epoch 186/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0101 - accuracy: 0.2722
Epoch 187/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0094 - accuracy: 0.2583

Since there are only two classes, the baseline for the lowest expected accuracy should be 0.5. Moreover, we are monitoring accuracy on the training set, so it should climb toward 100%; I expected overfitting, and according to the loss I am indeed overfitting.
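That baseline is easy to verify. A minimal sketch, assuming `y_train` as produced by the `load_mnist` helper in the full code below:

import numpy as np

# y_train holds only 0s and 1s, so its mean is the fraction of 1s.
ones_fraction = np.mean(y_train)
majority_baseline = max(ones_fraction, 1.0 - ones_fraction)
print("fraction of 1s: %.3f, majority-class baseline: %.3f"
      % (ones_fraction, majority_baseline))

On the 0/1 subset of MNIST the two classes are close to balanced (the 1s are slightly in the majority), so always predicting the majority class already scores a bit above 0.5; a training accuracy far below that cannot be explained by a bad classifier alone.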

At the last epoch, the situation is this:

12665/12665 [==============================] - 0s 11us/step - loss: 9.9710e-06 - accuracy: 0.0758

That is 7% accuracy, when the theoretical worst case with random guessing is 50%. This is no fluke. Something is going on here.

Can anyone spot the problem?

Full code:

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import Callback
import numpy as np
from matplotlib import pyplot as plt
import warnings

class EarlyStoppingByLossVal(Callback):
    """Stop training once the monitored quantity drops below a threshold."""

    def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
        super(EarlyStoppingByLossVal, self).__init__()
        self.monitor = monitor
        self.value = value
        self.verbose = verbose

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current = logs.get(self.monitor)
        if current is None:
            warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)
            return  # nothing to compare against the threshold

        if current < self.value:
            if self.verbose > 0:
                print("Epoch %05d: early stopping THR" % epoch)
            self.model.stop_training = True

def load_mnist():
    # Load MNIST, flatten each 28x28 image into a 784-vector,
    # keep only the digits 0 and 1, and scale pixels to [0, 1].
    mnist = keras.datasets.mnist
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()

    train_images = np.reshape(train_images, (train_images.shape[0], train_images.shape[1] * train_images.shape[2]))
    test_images = np.reshape(test_images, (test_images.shape[0], test_images.shape[1] * test_images.shape[2]))
    train_labels = np.reshape(train_labels, (train_labels.shape[0],))
    test_labels = np.reshape(test_labels, (test_labels.shape[0],))

    train_images = train_images[(train_labels == 0) | (train_labels == 1)]
    test_images = test_images[(test_labels == 0) | (test_labels == 1)]

    train_labels = train_labels[(train_labels == 0) | (train_labels == 1)]
    test_labels = test_labels[(test_labels == 0) | (test_labels == 1)]
    train_images, test_images = train_images / 255, test_images / 255

    return train_images, train_labels, test_images, test_labels



X_train, y_train, X_test, y_test = load_mnist()
train_acc = []
train_errors = []
test_acc = []
test_errors = []

width_list = [13000]
for width in width_list:
    print(width)

    model = keras.models.Sequential()
    model.add(layers.Dense(width, input_dim=X_train.shape[1], activation='relu', trainable=False))
    model.add(layers.Dense(1, input_dim=width, activation='linear'))
    model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

    callbacks = [EarlyStoppingByLossVal(monitor='loss', value=0.00001, verbose=1)]
    model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1, callbacks=callbacks)


    train_loss, train_accuracy = model.evaluate(X_train, y_train)
    test_loss, test_accuracy = model.evaluate(X_test, y_test)
    train_errors.append(train_loss)
    test_errors.append(test_loss)
    train_acc.append(train_accuracy)
    test_acc.append(test_accuracy)


plt.plot(width_list, train_errors, marker='D')
plt.xlabel("width")
plt.ylabel("train loss")
plt.show()
plt.plot(width_list, test_errors, marker='D')
plt.xlabel("width")
plt.ylabel("test loss")
plt.show()
plt.plot(width_list, train_acc, marker='D')
plt.xlabel("width")
plt.ylabel("train acc")
plt.show()
plt.plot(width_list, test_acc, marker='D')
plt.xlabel("width")
plt.ylabel("test acc")
plt.show()

A linear activation in the last layer makes no sense for a (binary) classification problem. Change your last layer to:

model.add(layers.Dense(1, input_dim=width, activation='sigmoid'))

A linear activation in the last layer is meant for regression problems, not for classification ones.
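To see why the accuracy metric in particular misbehaves: with loss="binary_crossentropy", Keras resolves metrics=["accuracy"] to binary_accuracy, which simply thresholds the raw model output at 0.5 as if it were a probability. A linear unit is unbounded, so that fixed threshold is meaningless for it. A minimal sketch, assuming TensorFlow 2.x (the scores here are made up for illustration):

import numpy as np
from tensorflow import keras

y_true = np.array([0.0, 0.0, 1.0, 1.0])

# Made-up unbounded scores, as a linear output layer could produce ...
raw = np.array([-3.0, 0.7, 0.4, 5.0])
# ... versus the same scores squashed into (0, 1) by a sigmoid.
prob = 1.0 / (1.0 + np.exp(-raw))

# binary_accuracy labels each sample by y_pred > 0.5 and compares with y_true.
print(keras.metrics.binary_accuracy(y_true, raw).numpy())   # 0.5
print(keras.metrics.binary_accuracy(y_true, prob).numpy())  # 0.75

Meanwhile binary_crossentropy clips its input into [ε, 1 − ε] before taking logs, so the loss quietly tolerates scores outside [0, 1] that the 0.5 cut judges differently; the loss and the metric are then measuring the model with two different rulers, which is how the loss can keep falling while the accuracy drifts. With the sigmoid output both read the same probabilities, and the rest of the script can stay unchanged.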
