I'm trying to build an LSTM model for classifying the ATIS dataset.
From a sentence of undefined size N, I generate a context window word embedding matrix. That's what I need to feed into my model, but I can't figure out how to do it.
When I define my input layer as:
def build_lstm(input_var=None):
    l_in = lasagne.layers.InputLayer(shape=(None, 1, None, None), input_var=input_var)
    l_hid = l_lstm = lasagne.layers.LSTMLayer(l_in, num_units=300)
    l_out = lasagne.layers.DenseLayer(l_hid, num_units=127, nonlinearity=lasagne.nonlinearities.softmax)
    return l_out
I get:
TypeError: unsupported operand type(s) for *: 'NoneType' and 'NoneType'
While if I define the input shape in the l_in declaration it works, for example:
l_in = lasagne.layers.InputLayer(shape=(None, 1, 30, 30), input_var=input_var)
The point is that each sentence has a different size, thus resulting in a context window word embedding matrix of different shape. What can I do?
Because of the way Lasagne/Theano handle the initialization of tensors, you can't simply specify (None, 1, None, None): the layer needs concrete feature dimensions to size its weights. As you've already found out, you need to give a size. In fact, as seen in this example, LSTMLayers seem to expect an input of shape (batch size, SEQ_LENGTH, num_features).
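To make the (batch size, SEQ_LENGTH, num_features) layout concrete, here is a minimal sketch of padding variable-length sentences to a common length while keeping a mask of the real positions. It is plain Python and only illustrates the idea; the helper name pad_batch, the pad value, and the toy data are assumptions made for this example, and it assumes every sentence is non-empty with feature vectors of equal width.

```python
def pad_batch(sentences, pad_value=0.0):
    """Pad sentences (lists of feature vectors) to a common length.

    Hypothetical helper for illustration. Returns (padded, mask, seq_length):
    padded has shape (batch size, seq_length, num_features) as nested lists,
    and mask has shape (batch size, seq_length) with 1 for real words and
    0 for padding.
    """
    seq_length = max(len(s) for s in sentences)
    num_features = len(sentences[0][0])  # assumes non-empty sentences
    padded, mask = [], []
    for s in sentences:
        n_pad = seq_length - len(s)
        # Copy the real vectors, then append all-pad vectors up to seq_length.
        rows = [list(v) for v in s]
        rows += [[pad_value] * num_features for _ in range(n_pad)]
        padded.append(rows)
        mask.append([1] * len(s) + [0] * n_pad)
    return padded, mask, seq_length


# Two toy "sentences": 3 words and 1 word, 2 features per word.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8]],
]
padded, mask, seq_length = pad_batch(batch)
```

With data shaped like this, the input layer can be declared with a fixed SEQ_LENGTH and num_features. Lasagne's LSTMLayer also accepts a mask_input argument, so a second input layer holding the mask could be used to ignore the padded steps; that wiring is not shown here.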
As I understand it, your options are to:
Some other relevant links:
https://groups.google.com/forum/#!msg/lasagne-users/9nMD5VJPLXA/sNzqxON_DwAJ
https://www.reddit.com/r/MachineLearning/comments/3dqdqr/keras_lstm_limitations/