Training a CNN-LSTM with Keras gets stuck at the first epoch

Question

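I am using Keras to build a CNN-LSTM model for tweet classification. The model takes two inputs, and the task is three-class classification. The code I use to build the model is below: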

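# Imports assumed from the standalone keras package (versions are listed further down)
from keras import regularizers
from keras.layers import Input, Conv2D, Flatten, Reshape, LSTM, Dense, concatenate
from keras.models import Model
from keras.optimizers import Adam

def conv2d_lstm_with_author():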

    # Get the input information - author & tweet
    author_repre_input = Input(shape=(100,), name='author_input')
    tweet_input = Input(shape=(13, 100, 1), name='tweet_input')

    # Create the convolutional layer and lstm layer
    conv2d = Conv2D(filters = 200, kernel_size = (2, 100), padding='same', activation='relu', 
                    use_bias=True, name='conv_1')(tweet_input)
    flat = Flatten(name='flatten_1')(conv2d)
    reshape_flat = Reshape((260000, 1), name='reshape_1')(flat)
    lstm = LSTM(100, return_state=False, activation='tanh', recurrent_activation='hard_sigmoid', name='lstm_1')(reshape_flat)
    concatenate_layer = concatenate([lstm, author_repre_input], axis=1, name='concat_1')
    dense_1 = Dense(10, activation='relu', name='dense_1')(concatenate_layer)
    output = Dense(3, activation='softmax', kernel_regularizer=regularizers.l2(0.01), name='output_dense')(dense_1)

    # Build the model
    model = Model(inputs=[author_repre_input, tweet_input], outputs=output)
    return model

model = conv2d_lstm_with_author()
model.summary()

optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

The shapes of my two inputs and the labels are:

author_repre_input: (40942, 100)
tweet_input: (40942, 13, 100, 1)
my label Train_Y: (40942, 3)

A snapshot of the model summary:

[Image: model summary snapshot]

When I train on the data with the following code:

model.fit([author_repre_input, tweet_input], [Train_Y], epochs=20, batch_size=32, validation_split=0.2, 
          shuffle=False, verbose=2)

the run stays stuck at the first epoch, and the log shows nothing useful, only:

Epoch 1/20

I would like to know why this happens. The tensorflow and keras versions I am using are:

tensorflow - 1.14.0
keras - 2.2.0

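Thank you very much for your time!

Update, January 20...

I tried training the model on Google Colab and watched the RAM while it ran. Colab allocated 25 GB of RAM to me. However, after a few seconds of training, the session crashed because it had used all the available RAM...

I think something must be wrong with the model itself... Any suggestions and insights would be greatly appreciated!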

Answer 1

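Fortunately for you, you are not stuck.

The problem comes from the fact that you pass verbose=2 in your model.fit call.

This means your code only prints a message at the end of each epoch, with no information while the epoch is still running.

To solve this and see the training progress, set verbose=1.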
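To get a feel for how much work runs silently before that first end-of-epoch line, here is a rough sketch using the numbers from the question (40942 samples, validation_split=0.2, batch_size=32). The helper name `batches_per_epoch` is made up for illustration, and the split rule (truncating the training share to an integer) is an assumption about how Keras divides the data, not something stated in the question.

```python
import math


def batches_per_epoch(num_samples, batch_size, validation_split=0.0):
    """Estimate how many batches one epoch of model.fit processes.

    Assumption: the training share is truncated to an integer, and the
    last, smaller batch still counts as one batch.
    """
    train_samples = int(num_samples * (1.0 - validation_split))
    return train_samples, math.ceil(train_samples / batch_size)


# Numbers taken from the question's model.fit call.
train_samples, batches = batches_per_epoch(40942, 32, validation_split=0.2)
print(train_samples, batches)
```

With verbose=2, none of those batches produces any output; only the summary line at the end of the epoch does.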

Answer 2

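I think I have found the answer...

The problem is in the convolutional layer. The kernel size is too small, which makes the dimensionality of the layer's output too high. To fix this, I changed the kernel size from (2, 100) to (3, 100). I also added dropout to my model. A summary of the model I am using now is below:

[Image: summary of the revised model]

The model now runs smoothly on Google Colab.

So if a similar problem comes up, I suggest checking the output dimensions of every layer. If the model produces a very high-dimensional output, the Keras API may stall during the training phase.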
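As a minimal sketch of that per-layer check, the convolution's output size can be worked out by hand before training. The helper names `conv2d_output_hw` and `flattened_length` are hypothetical, and the formulas are the usual 'same'/'valid' padding rules for a strided convolution. Note that under these rules padding='same' with stride 1 keeps the 13x100 spatial size whatever the kernel size, so the padding='valid' cases below are an assumption about what a smaller configuration could look like, not something the answer states.

```python
import math


def conv2d_output_hw(input_hw, kernel_hw, padding, strides=(1, 1)):
    """Spatial output size of a 2D convolution for 'same' or 'valid' padding."""
    out = []
    for size, kernel, stride in zip(input_hw, kernel_hw, strides):
        if padding == "same":
            out.append(math.ceil(size / stride))
        elif padding == "valid":
            out.append(math.ceil((size - kernel + 1) / stride))
        else:
            raise ValueError("padding must be 'same' or 'valid'")
    return tuple(out)


def flattened_length(input_hw, kernel_hw, filters, padding):
    """Number of values after Flatten, i.e. the LSTM sequence length
    when the result is reshaped to (length, 1)."""
    height, width = conv2d_output_hw(input_hw, kernel_hw, padding)
    return height * width * filters


# Layer from the question: 13x100 input, 200 filters, kernel (2, 100), padding='same'.
print(flattened_length((13, 100), (2, 100), 200, "same"))   # 260000

# Hypothetical variants with padding='valid' (an assumption, not from the answer).
print(flattened_length((13, 100), (2, 100), 200, "valid"))  # 2400
print(flattened_length((13, 100), (3, 100), 200, "valid"))  # 2200
```

The first figure matches the Reshape((260000, 1)) in the question, which means the LSTM is asked to step through 260000 timesteps for every sample. In a live session, model.summary() reports the same per-layer shapes.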

2 answers

Answer 1
5, accepted, 2020-01-19 09:11:28

Answer 2
0, 2020-02-05 03:45:41
