Given that I'm not very experienced with this, the following may well be a silly question (and the title may be equally beside the point; suggestions for changing it are welcome). I'm trying to get a Keras model to work with multiple inputs, but I keep running into problems with the input dimensions. Quite possibly my network setup makes little sense, but I first want to produce something that works (i.e. executes) and then experiment with different setups. Here's what I have now:
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense, concatenate

sent = Input(shape=(None, inputdim))
pos = Input(shape=(None, 1))
l1 = LSTM(40)(sent)
l2 = LSTM(40)(pos)
out = concatenate([l1, l2])
output = Dense(1, activation='sigmoid')(out)
model = Model(inputs=[sent, pos], outputs=output)
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
print('X1 shape:', np.shape(X1_train))
print('X1 Input shape:', np.shape(sent))
print('X2 shape:', np.shape(X2_train))
print('X2 Input shape:', np.shape(pos))
model.fit([X1_train, X2_train], Y_train, batch_size=1, epochs=nrEpochs)
This gets me the following output/error:
Using TensorFlow backend.
INFO: Starting iteration 1 of 1...
INFO: Starting with training of LSTM network.
X1 shape: (3065,)
X1 Input shape: (?, ?, 21900)
X2 shape: (3065, 1)
X2 Input shape: (?, ?, 1)
Traceback (most recent call last):
...
ValueError: Error when checking input: expected input_1 to have 3 dimensions,
but got array with shape (3065, 1)
If I understand things correctly (which I'm not at all sure about), Input basically converts the input to a tensor, adding a third dimension (in my case), but the data I feed the model in model.fit() is still two-dimensional (or even one-dimensional, in the case of X1_train). Any ideas on how to go about this are very welcome.
It helps to first understand how an LSTM works. An LSTM (like all recurrent units, such as GRU and SimpleRNN) expects input shaped as (batch, time_steps, token_dimensions).
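To make that shape requirement concrete, here is a small numpy sketch (the sizes are illustrative, not taken from the question):

```python
import numpy as np

# Assumed, illustrative sizes
batch, time_steps, token_dim = 32, 10, 300

# What an LSTM layer expects as input: a single 3-D tensor
x = np.zeros((batch, time_steps, token_dim))
print(x.shape)  # (32, 10, 300)

# By contrast, an array of shape (3065,), as printed in the question,
# is 1-D (typically an object array of variable-length sequences),
# so Keras cannot match it against a 3-D Input.
```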
In case you don't know about embeddings and how to use them in Keras, take a look here: https://keras.io/layers/embeddings/
Just to give you an idea of what the code should look like, here is how I modified your code to make it work:
sent = Input(shape=(time_steps,))
pos = Input(shape=(time_steps2,))
lstm_in = Embedding(vocab_size, 300)(sent)   # now you have a batch x time_steps x 300 tensor
lstm_in2 = Embedding(vocab_size2, 100)(pos)
l1 = LSTM(40)(lstm_in)
l2 = LSTM(40)(lstm_in2)
out = concatenate([l1, l2])
output = Dense(1, activation='sigmoid')(out)
model = Model(inputs=[sent, pos], outputs=output)
Note that the two inputs can have different numbers of timesteps. If the second one has only one timestep, pass it through a Dense layer instead of an LSTM.
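A minimal sketch of that single-timestep variant, assuming pos is one value per example (layer sizes and time_steps are illustrative; imports use tf.keras, though plain keras works the same):

```python
from tensorflow.keras.layers import Input, LSTM, Dense, Embedding, concatenate
from tensorflow.keras.models import Model

time_steps = 10      # assumed padded sequence length for the sentence input
vocab_size = 21900   # the input dimension printed in the question

sent = Input(shape=(time_steps,))            # integer word indices
pos = Input(shape=(1,))                      # one value per example

lstm_in = Embedding(vocab_size, 300)(sent)   # batch x time_steps x 300
l1 = LSTM(40)(lstm_in)
l2 = Dense(40, activation='relu')(pos)       # no recurrence needed for a single value

out = concatenate([l1, l2])
output = Dense(1, activation='sigmoid')(out)
model = Model(inputs=[sent, pos], outputs=output)
model.compile(optimizer='rmsprop', loss='binary_crossentropy')
```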