
Train accuracy decreases with train loss

I wrote this very simple code:

model = keras.models.Sequential()
# First layer: 13000 random ReLU features, frozen (trainable=False).
model.add(layers.Dense(13000, input_dim=X_train.shape[1], activation='relu', trainable=False))
# Output layer: a single linear unit.
model.add(layers.Dense(1, input_dim=13000, activation='linear'))
model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

# Full-batch training (batch_size equals the number of training samples).
model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1)

The data is MNIST, restricted to the digits "0" and "1" only. I have a very strange problem: the loss decreases monotonically to zero, as expected, but the accuracy, instead of increasing, decreases as well.

Here is a sample output:

12665/12665 [==============================] - 0s 11us/step - loss: 0.0107 - accuracy: 0.2355
Epoch 181/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0114 - accuracy: 0.2568
Epoch 182/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0128 - accuracy: 0.2726
Epoch 183/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0133 - accuracy: 0.2839
Epoch 184/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0134 - accuracy: 0.2887
Epoch 185/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0110 - accuracy: 0.2842
Epoch 186/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0101 - accuracy: 0.2722
Epoch 187/1000000

12665/12665 [==============================] - 0s 11us/step - loss: 0.0094 - accuracy: 0.2583

Since there are only two classes, the baseline for the lowest expected accuracy should be 0.5. Moreover, we are monitoring accuracy on the training set, so it should climb toward 100%; I expected overfitting, and according to the loss I am indeed overfitting.
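That baseline is easy to verify. A minimal sketch, assuming `y_train` as produced by the `load_mnist` helper in the full code below:

import numpy as np

# y_train holds only 0s and 1s, so its mean is the fraction of 1s.
ones_fraction = np.mean(y_train)
majority_baseline = max(ones_fraction, 1.0 - ones_fraction)
print("fraction of 1s: %.3f, majority-class baseline: %.3f"
      % (ones_fraction, majority_baseline))

On the 0/1 subset of MNIST the two classes are close to balanced (the 1s are slightly in the majority), so always predicting the majority class already scores a bit above 0.5; a training accuracy far below that cannot be explained by a bad classifier alone.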

At the last epoch, the situation is this:

12665/12665 [==============================] - 0s 11us/step - loss: 9.9710e-06 - accuracy: 0.0758

That is 7% accuracy, when the theoretical worst case with random guessing is 50%. This is no fluke. Something is going on here.

Can anyone spot the problem?

Full code:

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import Callback
import numpy as np
from matplotlib import pyplot as plt
import warnings

class EarlyStoppingByLossVal(Callback):
    """Stop training once the monitored quantity drops below a threshold."""

    def __init__(self, monitor='val_loss', value=0.00001, verbose=0):
        super(EarlyStoppingByLossVal, self).__init__()
        self.monitor = monitor
        self.value = value
        self.verbose = verbose

    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        current = logs.get(self.monitor)
        if current is None:
            warnings.warn("Early stopping requires %s available!" % self.monitor, RuntimeWarning)
            return  # nothing to compare against the threshold

        if current < self.value:
            if self.verbose > 0:
                print("Epoch %05d: early stopping THR" % epoch)
            self.model.stop_training = True

def load_mnist():
    # Load MNIST, flatten each 28x28 image into a 784-vector,
    # keep only the digits 0 and 1, and scale pixels to [0, 1].
    mnist = keras.datasets.mnist
    (train_images, train_labels), (test_images, test_labels) = mnist.load_data()

    train_images = np.reshape(train_images, (train_images.shape[0], train_images.shape[1] * train_images.shape[2]))
    test_images = np.reshape(test_images, (test_images.shape[0], test_images.shape[1] * test_images.shape[2]))
    train_labels = np.reshape(train_labels, (train_labels.shape[0],))
    test_labels = np.reshape(test_labels, (test_labels.shape[0],))

    train_images = train_images[(train_labels == 0) | (train_labels == 1)]
    test_images = test_images[(test_labels == 0) | (test_labels == 1)]

    train_labels = train_labels[(train_labels == 0) | (train_labels == 1)]
    test_labels = test_labels[(test_labels == 0) | (test_labels == 1)]
    train_images, test_images = train_images / 255, test_images / 255

    return train_images, train_labels, test_images, test_labels



X_train, y_train, X_test, y_test = load_mnist()
train_acc = []
train_errors = []
test_acc = []
test_errors = []

width_list = [13000]
for width in width_list:
    print(width)

    model = keras.models.Sequential()
    model.add(layers.Dense(width, input_dim=X_train.shape[1], activation='relu', trainable=False))
    model.add(layers.Dense(1, input_dim=width, activation='linear'))
    model.compile(loss="binary_crossentropy", optimizer='adam', metrics=["accuracy"])

    callbacks = [EarlyStoppingByLossVal(monitor='loss', value=0.00001, verbose=1)]
    model.fit(X_train, y_train, batch_size=X_train.shape[0], epochs=1000000, verbose=1, callbacks=callbacks)


    train_loss, train_accuracy = model.evaluate(X_train, y_train)
    test_loss, test_accuracy = model.evaluate(X_test, y_test)
    train_errors.append(train_loss)
    test_errors.append(test_loss)
    train_acc.append(train_accuracy)
    test_acc.append(test_accuracy)


plt.plot(width_list, train_errors, marker='D')
plt.xlabel("width")
plt.ylabel("train loss")
plt.show()
plt.plot(width_list, test_errors, marker='D')
plt.xlabel("width")
plt.ylabel("test loss")
plt.show()
plt.plot(width_list, train_acc, marker='D')
plt.xlabel("width")
plt.ylabel("train acc")
plt.show()
plt.plot(width_list, test_acc, marker='D')
plt.xlabel("width")
plt.ylabel("test acc")
plt.show()

A linear activation in the last layer makes no sense for a (binary) classification problem. Change your last layer to:

model.add(layers.Dense(1, input_dim=width, activation='sigmoid'))

A linear activation in the last layer is meant for regression problems, not for classification ones.
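To see why the accuracy metric in particular misbehaves: with loss="binary_crossentropy", Keras resolves metrics=["accuracy"] to binary_accuracy, which simply thresholds the raw model output at 0.5 as if it were a probability. A linear unit is unbounded, so that fixed threshold is meaningless for it. A minimal sketch, assuming TensorFlow 2.x (the scores here are made up for illustration):

import numpy as np
from tensorflow import keras

y_true = np.array([0.0, 0.0, 1.0, 1.0])

# Made-up unbounded scores, as a linear output layer could produce ...
raw = np.array([-3.0, 0.7, 0.4, 5.0])
# ... versus the same scores squashed into (0, 1) by a sigmoid.
prob = 1.0 / (1.0 + np.exp(-raw))

# binary_accuracy labels each sample by y_pred > 0.5 and compares with y_true.
print(keras.metrics.binary_accuracy(y_true, raw).numpy())   # 0.5
print(keras.metrics.binary_accuracy(y_true, prob).numpy())  # 0.75

Meanwhile binary_crossentropy clips its input into [ε, 1 − ε] before taking logs, so the loss quietly tolerates scores outside [0, 1] that the 0.5 cut judges differently; the loss and the metric are then measuring the model with two different rulers, which is how the loss can keep falling while the accuracy drifts. With the sigmoid output both read the same probabilities, and the rest of the script can stay unchanged.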
