I'm trying to port my sequential Keras network to PyTorch, but I'm having trouble with the LSTM units:
LSTM(512,
     stateful = False,
     return_sequences = True,
     dropout = 0.5),
LSTM(512,
     stateful = False,
     return_sequences = True,
     dropout = 0.5),
How should I formulate this in PyTorch? In particular, dropout appears to work very differently in PyTorch than it does in Keras.
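For reference, the two layers above sit inside a keras.Sequential model roughly like the sketch below (the input shape of 128 features and the final Dense layer are placeholders for illustration, not part of the actual model):
from tensorflow import keras
from tensorflow.keras.layers import LSTM, Dense

model = keras.Sequential([
    LSTM(512, stateful = False, return_sequences = True, dropout = 0.5,
         input_shape = (None, 128)),   # (timesteps, features); 128 is a placeholder
    LSTM(512, stateful = False, return_sequences = True, dropout = 0.5),
    Dense(10),                          # placeholder output layer
])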
The following should work for you.
lstm = nn.LSTM(
    input_size = ?,
    hidden_size = 512,
    num_layers = 1,
    batch_first = True,
    dropout = 0.5
)
You need to set the input_size. Check out the documentation on nn.LSTM.
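As a quick sanity check, here is a minimal sketch of running a batch through such a module, assuming an input_size of 128 (the dropout argument is omitted here; see the update below):
import torch
import torch.nn as nn

lstm = nn.LSTM(
    input_size = 128,   # assumed number of features per time step
    hidden_size = 512,
    num_layers = 1,
    batch_first = True
)

x = torch.randn(32, 50, 128)        # (batch, seq_len, features) because batch_first=True
output, (h_n, c_n) = lstm(x)        # output: (32, 50, 512); h_n, c_n: (1, 32, 512)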
Update
In a 1-layer LSTM there is no point in setting dropout, since the dropout argument is only applied to the outputs of the intermediate layers of a multi-layer LSTM module. PyTorch will therefore warn about dropout if num_layers is set to 1. If we want to apply dropout to the output of the LSTM module's final layer, we can do something like the following.
Since nn.LSTM returns a tuple (output, (h_n, c_n)) rather than a single tensor, it cannot simply be chained with nn.Dropout inside nn.Sequential. Instead, keep the two modules separate and apply the dropout to the LSTM's output tensor:
lstm = nn.LSTM(
    input_size = ?,
    hidden_size = 512,
    num_layers = 1,
    batch_first = True
)
dropout = nn.Dropout(0.5)

# in the forward pass:
# output, (h_n, c_n) = lstm(x)
# output = dropout(output)
With this definition, the output of the LSTM passes through a Dropout layer before being fed to the rest of the network.
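Since the original Keras model stacks two 512-unit LSTM layers, another option worth noting is a single nn.LSTM with num_layers = 2; in that case the dropout argument is valid and is applied between the two layers (this is similar to, but not identical to, Keras's per-layer dropout, which is applied to the layer inputs). A sketch, again with an assumed input_size of 128:
stacked_lstm = nn.LSTM(
    input_size = 128,   # assumed number of input features
    hidden_size = 512,
    num_layers = 2,     # two stacked 512-unit layers, as in the Keras model
    batch_first = True,
    dropout = 0.5       # applied to the output of the first layer only
)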