
Keras LSTM Input layer shape differs from actual input

Given that I'm not very experienced with this, the following may well be a silly question (and the title equally beside the point; any suggestions for modification are welcome). I'm trying to get a Keras model to work with multiple inputs, but I keep running into problems with the input dimensions. Quite possibly the setup of my network makes little sense, but I would first like to produce something that works (i.e. executes) and then experiment with different setups. Here's what I have now:

import numpy as np
from keras.layers import Input, LSTM, Dense, concatenate
from keras.models import Model

sent = Input(shape=(None, inputdim))
pos = Input(shape=(None, 1))

l1 = LSTM(40)(sent)
l2 = LSTM(40)(pos)
out = concatenate([l1, l2])
output = Dense(1, activation='sigmoid')(out)

model = Model(inputs=[sent, pos], outputs=output)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
print('X1 shape:', np.shape(X1_train))
print('X1 Input shape:', np.shape(sent))
print('X2 shape:', np.shape(X2_train))
print('X2 Input shape:', np.shape(pos))

model.fit([X1_train, X2_train], Y_train, batch_size=1, epochs=nrEpochs)

This gets me the following output/error:

Using TensorFlow backend.
INFO: Starting iteration 1 of 1...
INFO: Starting with training of LSTM network.
X1 shape: (3065,)
X1 Input shape: (?, ?, 21900)
X2 shape: (3065, 1)
X2 Input shape: (?, ?, 1)
Traceback (most recent call last):
  ...
ValueError: Error when checking input: expected input_1 to have 3 dimensions, 
but got array with shape (3065, 1)

If I understand things correctly (which I'm not at all sure about :), Input basically converts the input to a tensor, adding a third dimension (in my case), but the input I feed the model in model.fit() is still two-dimensional. Any ideas on how to go about this are very welcome.

You should understand better how LSTMs work. An LSTM (like all recurrent neural network units, such as the GRU and the vanilla RNN) expects an input shaped as (batch, time_steps, token_dimensions).

  • The first dimension is the batch_size, i.e. the number of examples you feed to the network together (this speeds up training because they can be processed in parallel).
  • The second dimension (time_steps) is the length of your sequence, and it has to be fixed. So, for example, if the longest sequence in your training data is 70, you might want to set time_steps = 70. If that is too long, you can choose an arbitrary length and truncate your sentences.
  • The third dimension is the size of each word (token) in the embedding space, or the size of your vocabulary if you feed one-hot representations of the words directly to the LSTM (which I discourage you from doing!).
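To make that layout concrete, here is a small numpy-only sketch (toy sizes, not the asker's real data) that pads variable-length sequences into the (batch, time_steps, token_dimensions) array an LSTM expects:

```python
import numpy as np

# Two toy sequences with 4 features per token but different lengths --
# analogous to the asker's variable-length sentences.
seqs = [np.random.rand(5, 4), np.random.rand(3, 4)]

time_steps = max(s.shape[0] for s in seqs)    # fix the sequence length
batch = np.zeros((len(seqs), time_steps, 4))  # (batch, time_steps, token_dims)
for i, s in enumerate(seqs):
    batch[i, :s.shape[0], :] = s              # shorter sequences stay zero-padded

print(batch.shape)  # (2, 5, 4)
```

For lists of integer token ids, keras.preprocessing.sequence.pad_sequences does this padding for you.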

In case you don't know about embeddings and how to use them in Keras, have a look here: https://keras.io/layers/embeddings/

Just to give you an idea of what the code should look like, here is how I modified your code to make it work:

from keras.layers import Input, LSTM, Dense, Embedding, concatenate
from keras.models import Model

sent = Input(shape=(time_steps,))
pos = Input(shape=(time_steps2,))
lstm_in = Embedding(vocab_size, 300)(sent)   # now you have a batch x time_steps x 300 tensor
lstm_in2 = Embedding(vocab_size2, 100)(pos)
l1 = LSTM(40)(lstm_in)
l2 = LSTM(40)(lstm_in2)
out = concatenate([l1, l2])
output = Dense(1, activation='sigmoid')(out)

model = Model(inputs=[sent, pos], outputs=output)

Note that the two inputs can have different numbers of timesteps. If the second one has only one, pass it through a Dense layer rather than an LSTM.
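A hedged sketch of that variant, assuming the second input really is a single scalar per example (the vocabulary size, sequence length, and layer widths are illustrative, not taken from the question):

```python
from keras.layers import Input, LSTM, Dense, Embedding, concatenate
from keras.models import Model

time_steps, vocab_size = 70, 5000  # illustrative values

sent = Input(shape=(time_steps,))
pos = Input(shape=(1,))  # one value per example, so no time axis

lstm_out = LSTM(40)(Embedding(vocab_size, 300)(sent))
pos_out = Dense(8, activation='relu')(pos)  # Dense instead of a second LSTM

output = Dense(1, activation='sigmoid')(concatenate([lstm_out, pos_out]))
model = Model(inputs=[sent, pos], outputs=output)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
```

The Dense branch just projects the scalar feature into a small vector so it can be concatenated with the 40-dimensional LSTM output before the final sigmoid.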
