
Bidirectional/LSTM Deep Learning Model, how should I be defining input_shape?

I am trying to define the input_shape within my model and keep getting one of two errors.

Either:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-39-f61a7f345eb6> in <module>()
----> 1 dnn_model = build_and_compile_model(normalizer)
      2 dnn_model.summary()

<ipython-input-38-b43700ab6d96> in build_and_compile_model(norm)
      8   model = keras.Sequential([
      9       norm,
---> 10       layers.Bidirectional(layers.LSTM(64, return_sequences=True, input_shape=(train_df.shape[1], train_df.shape[2]))),
     11       layers.LSTM(64),
     12       layers.Dense(32, activation='relu'),

IndexError: tuple index out of range

Or:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-f61a7f345eb6> in <module>()
----> 1 dnn_model = build_and_compile_model(normalizer)
      2 dnn_model.summary()

11 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py in assert_input_compatibility(input_spec, inputs, layer_name)
    216                          'expected ndim=' + str(spec.ndim) + ', found ndim=' +
    217                          str(ndim) + '. Full shape received: ' +
--> 218                          str(tuple(shape)))
    219     if spec.max_ndim is not None:
    220       ndim = x.shape.rank

ValueError: Input 0 of layer bidirectional_5 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 7)

I get the ValueError when my model looks like this:

def build_and_compile_model(norm):
  model = keras.Sequential([
      norm,
      layers.Bidirectional(layers.LSTM(64, return_sequences=True, input_shape=(train_df.shape))),
      layers.LSTM(64),
      layers.Dense(32, activation='relu'),
      #layers.Dense(32, activation='relu'),
      layers.Dense(1)
  ])

  model.compile(loss='mae',
                #optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=1))
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
  return model

It's my understanding that my input_shape needs to be three-dimensional for an LSTM/RNN model. I'm fairly new to LSTM layers, so any input that could further my understanding of this concept would be greatly appreciated!

You need to make sure that the input_shape you pass to your Bidirectional/LSTM layer has the dimensions (timesteps, features); you do not need to worry about the batch dimension:

import tensorflow as tf

def build_and_compile_model(norm, timesteps, features):
  model = tf.keras.Sequential([
      norm,
      # input_shape excludes the batch dimension: (timesteps, features)
      tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64, return_sequences=True, input_shape=(timesteps, features))),
      tf.keras.layers.LSTM(64),
      tf.keras.layers.Dense(32, activation='relu'),
      tf.keras.layers.Dense(1)
  ])

  model.compile(loss='mae',
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
  return model

# Normalize over the feature axis (axis 2 of (batch, timesteps, features)).
norm = tf.keras.layers.LayerNormalization(axis=2, center=True, scale=True)

timesteps = 5
features = 10
batch_size = 32
model = build_and_compile_model(norm, timesteps, features)
model(tf.random.normal((batch_size, timesteps, features)))
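
If you want to sanity-check the wiring, you can print the shape of that dummy forward pass; a minimal check, reusing the same timesteps, features and batch_size from above:

output = model(tf.random.normal((batch_size, timesteps, features)))
print(output.shape)  # (32, 1): one prediction per batch element
model.summary()      # available now that the call above has built the model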

Note that you can omit the input_shape specification completely; your model will then derive the shape when you pass in real data. If your input data does not have the shape (batch_size, timesteps, features), you should ask yourself whether an LSTM is the right model for you. In your question, your data seems to have the shape (batch_size, 7), which cannot be fed directly to an LSTM. (This also explains your IndexError: train_df is two-dimensional, so its shape tuple only has indices 0 and 1, and train_df.shape[2] is out of range.) If an LSTM is the right model for your problem, you need to add a timesteps dimension, for example with tf.repeat or tf.expand_dims:

batch_size = 32
input_data = tf.random.normal((batch_size, 7))
print('Input shape: ', input_data.shape)

# Repeat each feature 5 times along axis 1, then reshape into 5 timesteps of 7 features.
repeated_input = tf.reshape(tf.repeat(input=input_data, repeats=5, axis=1), shape=(batch_size, 5, 7))
print('Input shape after repeat: ', repeated_input.shape)

# Insert a timesteps dimension of size 1.
expanded_input = tf.expand_dims(input=input_data, axis=1)
print('Input shape after expand_dims: ', expanded_input.shape)

Output:

Input shape:  (32, 7)
Input shape after repeat:  (32, 5, 7)
Input shape after expand_dims:  (32, 1, 7)
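
Either tensor can then be fed to the model. A minimal sketch, reusing build_and_compile_model from above with the expanded input, i.e. a single timestep with 7 features:

norm = tf.keras.layers.LayerNormalization(axis=2)
model = build_and_compile_model(norm, timesteps=1, features=7)
print(model(expanded_input).shape)  # (32, 1)

For repeated_input you would pass timesteps=5 instead.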
