
How can I set up a Dense bottleneck in a stacked LSTM with Keras?

I have:

        self.model.add(Bidirectional(LSTM(lstm1_size, input_shape=(
            seq_length, feature_dim), return_sequences=True)))
        self.model.add(BatchNormalization())
        self.model.add(Dropout(0.2))

        self.model.add(Bidirectional(
            LSTM(lstm2_size, return_sequences=True)))
        self.model.add(BatchNormalization())
        self.model.add(Dropout(0.2))

        # BOTTLENECK HERE

        self.model.add(Bidirectional(
            LSTM(lstm3_size, return_sequences=True)))
        self.model.add(BatchNormalization())
        self.model.add(Dropout(0.2))

        self.model.add(Bidirectional(
            LSTM(lstm4_size, return_sequences=True)))
        self.model.add(BatchNormalization())
        self.model.add(Dropout(0.2))

        self.model.add(Dense(feature_dim, activation='linear'))

However, I want to set up an autoencoder-like setup, without having to build 2 separate models. Where I have the comment BOTTLENECK HERE, I want a vector of some dimension, say bottleneck_dim.

After that, there should be some LSTM layers that reconstruct a sequence with the same dimensions as the initial input. However, I believe that adding a Dense layer will not return one vector, but instead a vector for each timestep of the sequence?

  • Dense has been updated to automatically act as if wrapped in TimeDistributed - i.e. applied to a sequence output it returns one vector per timestep, shaped (batch_size, seq_length, dense_units), not a single bottleneck vector per sample (see the shape check after this list).
  • A workaround is to place a Flatten() before it, so the Dense bottleneck receives one flat vector per sample, shaped (batch_size, seq_length * 2 * lstm2_size) after the Bidirectional layer. I wouldn't recommend it, however, as it's likely to corrupt temporal information (you're mixing channels and timesteps). Further, it hard-wires the network to one seq_length, so you can no longer train or run inference on any other seq_length.
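A quick way to see the per-timestep behaviour is to check the output shape directly; a minimal sketch, with the layer sizes assumed purely for illustration:

    from keras.models import Sequential
    from keras.layers import Bidirectional, LSTM, Dense

    seq_length, feature_dim, lstm2_size = 200, 64, 96   # assumed sizes, for illustration only

    m = Sequential()
    m.add(Bidirectional(LSTM(lstm2_size, return_sequences=True),
                        input_shape=(seq_length, feature_dim)))
    m.add(Dense(10))
    # Dense is applied to every timestep independently, so the output stays 3D:
    print(m.output_shape)   # (None, 200, 10) - one 10-dim vector per timestep, not per sample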

A preferred alternative is Bidirectional(LSTM(..., return_sequences=False)), which returns only the last timestep's output, shaped (batch_size, lstm_bottleneck_size). To feed its output to the next LSTM, you'll need a RepeatVector(seq_length) after the return_sequences=False layer; a sketch of the full wiring follows.
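A minimal sketch of how that bottleneck could be wired into the stack above (the layer sizes and the bottleneck_units name are assumptions, chosen only to make the example runnable):

    from keras.models import Sequential
    from keras.layers import (Bidirectional, LSTM, BatchNormalization,
                              Dropout, RepeatVector, Dense)

    # Assumed sizes, for illustration only
    seq_length, feature_dim = 200, 64
    lstm1_size, lstm2_size = 128, 96
    bottleneck_units = 32            # bottleneck vector is 2 * 32 = 64 wide (Bidirectional concatenates)
    lstm3_size, lstm4_size = 96, 128

    model = Sequential()

    # Encoder: return_sequences=True so each layer passes full sequences onward
    model.add(Bidirectional(LSTM(lstm1_size, return_sequences=True),
                            input_shape=(seq_length, feature_dim)))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))
    model.add(Bidirectional(LSTM(lstm2_size, return_sequences=True)))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))

    # Bottleneck: return_sequences=False collapses the sequence to one vector
    # per sample, shape (batch_size, 2 * bottleneck_units)
    model.add(Bidirectional(LSTM(bottleneck_units, return_sequences=False)))

    # Tile that vector back out to seq_length timesteps so the decoder gets 3D input
    model.add(RepeatVector(seq_length))

    # Decoder: reconstruct a sequence of the original shape
    model.add(Bidirectional(LSTM(lstm3_size, return_sequences=True)))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))
    model.add(Bidirectional(LSTM(lstm4_size, return_sequences=True)))
    model.add(BatchNormalization())
    model.add(Dropout(0.2))
    model.add(Dense(feature_dim, activation='linear'))

    model.compile(optimizer='adam', loss='mse')
    model.summary()

With this layout the output shape matches the input, (batch_size, seq_length, feature_dim), so the single model can be trained as an autoencoder with model.fit(x, x).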

Do mind the extent of the "bottleneck", though; e.g. if (seq_length, feature_dim) = (200, 64) and lstm_bottleneck_size = 400, that's a (200 * 64) / (1 * 400) = 32x reduction, which is quite large and may overwhelm the network. I'd suggest aiming for about 8x instead.
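The arithmetic, spelled out with the numbers from that example:

    # Compression-ratio check for the bottleneck (example numbers from above)
    seq_length, feature_dim = 200, 64
    lstm_bottleneck_size = 400

    input_values = seq_length * feature_dim              # 12800 values per sample
    reduction = input_values / lstm_bottleneck_size      # 32.0 -> a 32x reduction, likely too aggressive
    suggested_size = input_values // 8                    # 1600 -> roughly the suggested ~8x target
    print(reduction, suggested_size)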
