
Bidirectional/LSTM Deep Learning Model, how should I be defining input_shape?

I am trying to define the input_shape within my model and keep getting one of two errors.

Either:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-39-f61a7f345eb6> in <module>()
----> 1 dnn_model = build_and_compile_model(normalizer)
      2 dnn_model.summary()

<ipython-input-38-b43700ab6d96> in build_and_compile_model(norm)
      8   model = keras.Sequential([
      9       norm,
---> 10       layers.Bidirectional(layers.LSTM(64, return_sequences=True, input_shape=(train_df.shape[1], train_df.shape[2]))),
     11       layers.LSTM(64),
     12       layers.Dense(32, activation='relu'),

IndexError: tuple index out of range

Or:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-f61a7f345eb6> in <module>()
----> 1 dnn_model = build_and_compile_model(normalizer)
      2 dnn_model.summary()

11 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    216                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
    217                          str(ndim) + '. Full shape received: ' +
--> 218                          str(tuple(shape)))
    219     if spec.max_ndim is not None:
    220       ndim = x.shape.rank

ValueError: Input 0 of layer bidirectional_5 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 7)

I get the ValueError when my model looks like this:

def build_and_compile_model(norm):
  model = keras.Sequential([
      norm,
      layers.Bidirectional(layers.LSTM(64, return_sequences=True, input_shape=(train_df.shape))),
      layers.LSTM(64),
      layers.Dense(32, activation='relu'),
      #layers.Dense(32, activation='relu'),
      layers.Dense(1)
  ])

  model.compile(loss='mae',
                #optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=1))
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
  return model

It's my understanding that my input_shape needs to be three-dimensional for an LSTM/RNN model. I'm fairly new to LSTM layers, so any input that could further my understanding of this concept would be greatly appreciated!

You need to make sure that the input_shape you pass to your Bidirectional/LSTM layer has the dimensions (timesteps, features); you do not need to worry about the batch dimension:

import tensorflow as tf

def build_and_compile_model(norm, timesteps, features):
  model = tf.keras.Sequential([
      norm,
      # input_shape excludes the batch dimension: (timesteps, features)
      tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True, input_shape=(timesteps, features))),
      tf.keras.layers.LSTM(64),
      tf.keras.layers.Dense(32, activation='relu'),
      tf.keras.layers.Dense(1)
  ])

  model.compile(loss='mae',
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
  return model

# Normalize over the feature axis (axis 2 of (batch, timesteps, features)).
norm = tf.keras.layers.LayerNormalization(axis=2, center=True, scale=True)

timesteps = 5
features = 10
batch_size = 32
model = build_and_compile_model(norm, timesteps, features)
model(tf.random.normal((batch_size, timesteps, features)))
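
If you want to sanity-check the wiring, you can print the shape of that dummy forward pass; a minimal check, reusing the same timesteps, features and batch_size from above:

output = model(tf.random.normal((batch_size, timesteps, features)))
print(output.shape)  # (32, 1): one prediction per batch element
model.summary()      # available now that the call above has built the model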

Note that you can omit the input_shape specification completely; your model will then derive the shape when you pass in real data. If your input data does not have the shape (batch_size, timesteps, features), you should ask yourself whether an LSTM is the right model for you. In your question, your data seems to have the shape (batch_size, 7), which cannot be fed directly to an LSTM. (This also explains your IndexError: train_df is two-dimensional, so its shape tuple only has indices 0 and 1, and train_df.shape[2] is out of range.) If an LSTM is the right model for your problem, you need to add a timesteps dimension, for example with tf.repeat or tf.expand_dims:

batch_size = 32
input_data = tf.random.normal((batch_size, 7))
print('Input shape: ', input_data.shape)

# Repeat each feature 5 times along axis 1, then reshape into 5 timesteps of 7 features.
repeated_input = tf.reshape(tf.repeat(input=input_data, repeats=5, axis=1), shape=(batch_size, 5, 7))
print('Input shape after repeat: ', repeated_input.shape)

# Insert a timesteps dimension of size 1.
expanded_input = tf.expand_dims(input=input_data, axis=1)
print('Input shape after expand_dims: ', expanded_input.shape)

Output:

Input shape:  (32, 7)
Input shape after repeat:  (32, 5, 7)
Input shape after expand_dims:  (32, 1, 7)
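
Either tensor can then be fed to the model. A minimal sketch, reusing build_and_compile_model from above with the expanded input, i.e. a single timestep with 7 features:

norm = tf.keras.layers.LayerNormalization(axis=2)
model = build_and_compile_model(norm, timesteps=1, features=7)
print(model(expanded_input).shape)  # (32, 1)

For repeated_input you would pass timesteps=5 instead.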
