
Why is the loss of my autoencoder not going down at all during training?

I am following this tutorial to create a Keras-based autoencoder, but using my own data. That dataset includes about 20k training and about 4k validation images. All of them are very similar and show the very same object. I haven't modified the Keras model layout from the tutorial; I only changed the input size, since I use 300x300 images. So my model looks like this:

Model: "autoencoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 300, 300, 1)]     0
_________________________________________________________________
encoder (Functional)         (None, 16)                5779216
_________________________________________________________________
decoder (Functional)         (None, 300, 300, 1)       6176065
=================================================================
Total params: 11,955,281
Trainable params: 11,954,897
Non-trainable params: 384
_________________________________________________________________
Model: "encoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 300, 300, 1)]     0
_________________________________________________________________
conv2d (Conv2D)              (None, 150, 150, 32)      320
_________________________________________________________________
leaky_re_lu (LeakyReLU)      (None, 150, 150, 32)      0
_________________________________________________________________
batch_normalization (BatchNo (None, 150, 150, 32)      128
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 75, 75, 64)        18496
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU)    (None, 75, 75, 64)        0
_________________________________________________________________
batch_normalization_1 (Batch (None, 75, 75, 64)        256
_________________________________________________________________
flatten (Flatten)            (None, 360000)            0
_________________________________________________________________
dense (Dense)                (None, 16)                5760016
=================================================================
Total params: 5,779,216
Trainable params: 5,779,024
Non-trainable params: 192
_________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 16)]              0
_________________________________________________________________
dense_1 (Dense)              (None, 360000)            6120000
_________________________________________________________________
reshape (Reshape)            (None, 75, 75, 64)        0
_________________________________________________________________
conv2d_transpose (Conv2DTran (None, 150, 150, 64)      36928
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU)    (None, 150, 150, 64)      0
_________________________________________________________________
batch_normalization_2 (Batch (None, 150, 150, 64)      256
_________________________________________________________________
conv2d_transpose_1 (Conv2DTr (None, 300, 300, 32)      18464
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU)    (None, 300, 300, 32)      0
_________________________________________________________________
batch_normalization_3 (Batch (None, 300, 300, 32)      128
_________________________________________________________________
conv2d_transpose_2 (Conv2DTr (None, 300, 300, 1)       289
_________________________________________________________________
activation (Activation)      (None, 300, 300, 1)       0
=================================================================
Total params: 6,176,065
Trainable params: 6,175,873
Non-trainable params: 192

Then I initialize my model like this:

IMGSIZE = 300
EPOCHS = 20
LR = 0.0001

(encoder, decoder, autoencoder) = ConvAutoencoder.build(IMGSIZE, IMGSIZE, 1)
sched = ExponentialDecay(initial_learning_rate=LR, decay_steps=EPOCHS, decay_rate=LR / EPOCHS)
autoencoder.compile(loss="mean_squared_error", optimizer=Adam(learning_rate=sched))

Then I train my model like this:

image_generator = ImageDataGenerator(rescale=1.0 / 255)
train_gen = image_generator.flow_from_directory(
    os.path.join(args.images, "training"),
    class_mode="input",
    color_mode="grayscale",
    target_size=(IMGSIZE, IMGSIZE),
    batch_size=BS,
)
val_gen = image_generator.flow_from_directory(
    os.path.join(args.images, "validation"),
    class_mode="input",
    color_mode="grayscale",
    target_size=(IMGSIZE, IMGSIZE),
    batch_size=BS,
)
hist = autoencoder.fit(train_gen, validation_data=val_gen, epochs=EPOCHS, batch_size=BS)

My batch size BS is 32 and I start with an initial Adam learning rate of 0.001 (but I also tried values from 0.1 down to 0.0001). I also tried increasing the latent dimensionality to something like 1024, but that doesn't solve my issue either.

Now during training the loss goes down in the first epoch from about 0.5 to about 0.2. Beginning with the second epoch, the loss sticks at the very same value, e.g. 0.1989, and stays there "forever", regardless of how many epochs I train for and/or the initial learning rate I use.

Any ideas what could be the problem here?

It could be that the decay_rate argument in tf.keras.optimizers.schedules.ExponentialDecay is decaying your learning rate more quickly than you think, effectively driving your learning rate to zero.
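
ExponentialDecay evaluates to initial_learning_rate * decay_rate ** (step / decay_steps), where step counts optimizer updates (batches), not epochs. A minimal sketch of what the schedule from the question produces, assuming roughly 20k training images and a batch size of 32 (those step counts are estimates, not taken from the question's code):

from tensorflow.keras.optimizers.schedules import ExponentialDecay

EPOCHS = 20
LR = 0.0001

sched = ExponentialDecay(
    initial_learning_rate=LR,
    decay_steps=EPOCHS,      # interpreted as 20 optimizer steps, not 20 epochs
    decay_rate=LR / EPOCHS,  # 5e-06 per decay_steps
)

# decayed LR = LR * decay_rate ** (step / decay_steps)
for step in [0, 20, 100, 625]:  # ~625 steps is about one epoch at 20k images / batch size 32
    print(step, float(sched(step)))
# After only 20 batches the LR is already 1e-4 * 5e-6 = 5e-10, i.e. effectively zero.

# A schedule that decays gently per epoch might instead look like this (assumed values):
steps_per_epoch = 20000 // 32
saner_sched = ExponentialDecay(
    initial_learning_rate=LR,
    decay_steps=steps_per_epoch,
    decay_rate=0.96,
)

So the learning rate has collapsed to essentially zero well before the first epoch ends, which matches the loss plateauing from the second epoch onward. Using a fixed learning rate, or a decay_rate close to 1 with decay_steps measured in optimizer steps, avoids this.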
