How can I set up a Dense bottleneck in a stacked LSTM with Keras?
I have:
self.model.add(Bidirectional(
    LSTM(lstm1_size, input_shape=(seq_length, feature_dim),
         return_sequences=True)))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
self.model.add(Bidirectional(LSTM(lstm2_size, return_sequences=True)))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
# BOTTLENECK HERE
self.model.add(Bidirectional(LSTM(lstm3_size, return_sequences=True)))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
self.model.add(Bidirectional(LSTM(lstm4_size, return_sequences=True)))
self.model.add(BatchNormalization())
self.model.add(Dropout(0.2))
self.model.add(Dense(feature_dim, activation='linear'))
However, I want an autoencoder-like setup, without having to use 2 separate models. Where I have the comment BOTTLENECK HERE, I want to have a single vector of some dimension, say bottleneck_dim.
After that, there should be some LSTM layers that reconstruct a sequence of the same dimensions as the initial input. However, I believe that adding a Dense layer will not return one vector, but instead return a vector for each timestep of the sequence?
Dense has been updated to automatically act as if wrapped with TimeDistributed - i.e. you'll get (batch_size, seq_length, lstm2_size). You can add Flatten() before it, so Dense's output shape will be (batch_size, seq_length * lstm2_size). I wouldn't recommend it, however, as it's likely to corrupt temporal information (you're mixing channels and timesteps), and it fixes your model to this exact seq_length, so you can no longer do training or inference on any other seq_length.
A preferred alternative is Bidirectional(LSTM(..., return_sequences=False)), which returns only the last timestep's output, shaped (batch_size, lstm_bottleneck_size). To feed its outputs to the next LSTM, you'll need RepeatVector(seq_length) after the return_sequences=False layer.
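Putting that together, a minimal sketch of the suggested bottleneck might look as follows, assuming TensorFlow's Keras; the sizes and the names lstm1_size, lstm_bottleneck_size, and lstm3_size are illustrative placeholders, not values prescribed by the question:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Bidirectional, LSTM, Dense,
                                     RepeatVector, BatchNormalization, Dropout)

seq_length, feature_dim = 200, 64
lstm1_size, lstm_bottleneck_size, lstm3_size = 128, 400, 128  # placeholders

model = Sequential()
model.add(Bidirectional(LSTM(lstm1_size, return_sequences=True),
                        input_shape=(seq_length, feature_dim)))
model.add(BatchNormalization())
model.add(Dropout(0.2))
# Bottleneck: return_sequences=False collapses the whole sequence into one
# vector, shaped (batch_size, 2 * lstm_bottleneck_size) - the 2x comes from
# Bidirectional concatenating the forward and backward passes.
model.add(Bidirectional(LSTM(lstm_bottleneck_size, return_sequences=False)))
# RepeatVector turns that vector back into a sequence for the decoder LSTMs.
model.add(RepeatVector(seq_length))
model.add(Bidirectional(LSTM(lstm3_size, return_sequences=True)))
model.add(Dense(feature_dim, activation='linear'))  # reconstruct the input
```

Note the final output shape is (batch_size, seq_length, feature_dim), matching the encoder's input, so the same model can be trained to reconstruct its input end to end.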
Do mind the extent of the "bottleneck", though; e.g. if (seq_length, feature_dim) = (200, 64) and lstm_bottleneck_size = 400, that's (1 * 400) / (200 * 64) = a x32 reduction, which is quite large and may overwhelm the network. I'd suggest aiming for x8.
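The arithmetic above can be checked directly; the numbers are the illustrative ones from the text, not anything prescribed:

```python
seq_length, feature_dim = 200, 64
lstm_bottleneck_size = 400

values_in = seq_length * feature_dim   # 12800 values per input sample
values_out = 1 * lstm_bottleneck_size  # 400 values at the bottleneck
reduction = values_in / values_out
print(reduction)  # 32.0, i.e. a x32 reduction; ~x8 is the gentler target
```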