
LSTM Text generation Input_shape

I am attempting to add another LSTM layer to my model, but I am only a beginner and not very experienced. I am using the (Better) - Donald Trump Tweets! dataset on Kaggle for LSTM text generation.

I am struggling to get it to run, as it returns an error:

ValueError: Input 0 of layer lstm_16 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 128]

My model is:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)), return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.5))
model2.add(LSTM(128))
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))

# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model2.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')

The model works with only two LSTM layers, two Dropout layers, and one Dense layer. I think something is wrong with my setup for input_shape, but I could be wrong. My model is based on a notebook from the above dataset (notebook here).

In order to stack RNNs, you will have to use return_sequences=True. From the error it can be seen that the layer was expecting a 3-dimensional tensor, but received a 2-dimensional one. Here you can read that the return_sequences=True flag makes a layer output a 3-dimensional tensor:

If True, the full sequence of successive outputs for each timestep is returned (a 3D tensor of shape (batch_size, timesteps, output_features)).
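As a rough illustration of that shape rule (this is a hypothetical helper, not Keras itself; `units` and `return_sequences` mirror the LSTM layer's parameters):

```python
def lstm_output_shape(input_shape, units, return_sequences):
    """Illustrative shape rule for an LSTM layer (not the Keras implementation).

    input_shape: (batch_size, timesteps, features) -- a 3D input is required.
    """
    if len(input_shape) != 3:
        raise ValueError(f"expected ndim=3, found ndim={len(input_shape)}")
    batch_size, timesteps, _ = input_shape
    if return_sequences:
        # One output vector per timestep -> 3D, usable as input to a stacked LSTM.
        return (batch_size, timesteps, units)
    # Only the last timestep's output -> 2D, suitable for a final Dense head.
    return (batch_size, units)

# With return_sequences=True, the 3D shape is preserved for the next LSTM:
print(lstm_output_shape((None, 40, 57), 128, True))   # (None, 40, 128)
# Without it, the next LSTM would see a 2D tensor -- the ValueError above:
print(lstm_output_shape((None, 40, 57), 128, False))  # (None, 128)
```

This is exactly why the middle LSTM in your model fails: the second LSTM(128) emits a 2D tensor, which the third LSTM cannot consume.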

Assuming that there are no issues with your input layer and the input data is passed on correctly, I propose trying the following model.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

print('Building model...')
model2 = Sequential()
model2.add(LSTM(128, input_shape=(maxlen, len(chars)), return_sequences=True))
model2.add(Dropout(0.5))
model2.add(LSTM(128, return_sequences=True))  # keep 3D output for the next LSTM
model2.add(Dropout(0.5))
model2.add(LSTM(128))  # last LSTM: 2D output for the Dense head
model2.add(Dropout(0.2))
model2.add(Dense(len(chars), activation='softmax'))

# optimizer = RMSprop(lr=0.01)
optimizer = Adam()
model2.compile(loss='categorical_crossentropy', optimizer=optimizer)
print('model built')
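For completeness, the input itself must be a 3D one-hot tensor of shape (samples, maxlen, len(chars)) for input_shape=(maxlen, len(chars)) to match. A minimal sketch of that encoding, using a tiny stand-in string instead of the tweet corpus (the windowing and one-hot scheme follow the standard Keras char-level text-generation setup; the names here are illustrative):

```python
import numpy as np

# Hypothetical tiny corpus standing in for the tweet text.
text = "make text generation great again"
chars = sorted(set(text))
char_indices = {c: i for i, c in enumerate(chars)}

maxlen = 10  # window length; must match input_shape=(maxlen, len(chars))
step = 3
sentences, next_chars = [], []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i:i + maxlen])   # input window
    next_chars.append(text[i + maxlen])    # character to predict

# One-hot encode: X is 3D (samples, maxlen, len(chars)), y is 2D.
X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.float32)
y = np.zeros((len(sentences), len(chars)), dtype=np.float32)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        X[i, t, char_indices[char]] = 1.0
    y[i, char_indices[next_chars[i]]] = 1.0

print(X.shape)  # (samples, maxlen, len(chars)) -- the 3D input the first LSTM expects
```

If your X is 2D at this point (for example, indices instead of one-hot windows), the very first layer would already raise the same ndim error, regardless of return_sequences.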
