PyTorch LSTM dropout vs Keras LSTM dropout
I'm trying to port my sequential Keras network to PyTorch. But I'm having trouble with the LSTM units:
LSTM(512,
     stateful = False,
     return_sequences = True,
     dropout = 0.5),
LSTM(512,
     stateful = False,
     return_sequences = True,
     dropout = 0.5),
How should I formulate this in PyTorch? In particular, dropout appears to work very differently in PyTorch than it does in Keras.
The following should work for you.
lstm = nn.LSTM(
    input_size = ?,
    hidden_size = 512,
    num_layers = 1,
    batch_first = True,
    dropout = 0.5
)
You need to set the input_size. Check out the documentation on LSTM.
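Since your Keras model stacks two LSTM(512) layers, another way to port it is a single nn.LSTM with num_layers = 2. Note the semantics differ: PyTorch's dropout argument applies dropout between stacked layers, while Keras's dropout argument applies to each layer's inputs, so this is an approximation rather than an exact match. A minimal sketch, assuming a hypothetical input feature size of 128:

import torch
import torch.nn as nn

INPUT_SIZE = 128  # placeholder; replace with your actual feature dimension

# Two stacked 512-unit layers, roughly mirroring the two Keras LSTMs.
# PyTorch applies dropout = 0.5 to the outputs of every layer except the last.
lstm = nn.LSTM(
    input_size = INPUT_SIZE,
    hidden_size = 512,
    num_layers = 2,
    batch_first = True,
    dropout = 0.5,
)

x = torch.randn(8, 20, INPUT_SIZE)    # (batch, seq_len, features)
output, (h_n, c_n) = lstm(x)          # output: (8, 20, 512), the full sequence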
Update
In a 1-layer LSTM, there is no point in assigning dropout, since dropout is applied to the outputs of intermediate layers in a multi-layer LSTM module. So PyTorch may complain about dropout if num_layers is set to 1. If we want to apply dropout to the final layer's output from the LSTM module, we can do something like the following.
lstm = nn.LSTM(
    input_size = ?,
    hidden_size = 512,
    num_layers = 1,
    batch_first = True
)
dropout = nn.Dropout(0.5)

# nn.LSTM returns a tuple (output, (h_n, c_n)), so it cannot be chained
# directly into nn.Dropout via nn.Sequential; apply dropout to the
# output tensor explicitly instead.
output, (h_n, c_n) = lstm(x)
output = dropout(output)
With this setup, the output tensor of the LSTM is passed through a Dropout layer.
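To make the pattern self-contained, here is a minimal sketch that wraps it in an nn.Module; the module name and the input size of 128 are hypothetical:

import torch
import torch.nn as nn

class LSTMWithDropout(nn.Module):
    # Hypothetical wrapper illustrating the LSTM-then-dropout pattern.
    def __init__(self, input_size, hidden_size=512, p=0.5):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size,
                            num_layers = 1, batch_first = True)
        self.dropout = nn.Dropout(p)

    def forward(self, x):
        output, _ = self.lstm(x)      # output: (batch, seq_len, hidden_size)
        return self.dropout(output)   # dropout on the final layer's outputs

model = LSTMWithDropout(input_size = 128)   # 128 is a placeholder feature size
x = torch.randn(8, 20, 128)                 # (batch, seq_len, features)
y = model(x)                                # y: (8, 20, 512)

Remember that nn.Dropout is only active in training mode; call model.eval() for inference so the dropout is disabled.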