
Tensorflow classification model returns incorrect output shape

I am making a simple binary classification model that takes 30 timestamps with 5 features each and should return the probability of a certain class.

I've run into the problem of the model's loss not decreasing over epochs. Looking at the model's summary and output, I found that instead of producing a single number (the probability of the class), it produces an array of 30 probabilities, which probably prevents it from learning.

The model code is as follows:

print(train['inputs'].shape)  # (3511, 30, 5)
print(train['labels'].shape)  # (3511, 1)

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

lstm_model.compile(
    loss="binary_crossentropy",
    optimizer=tf.optimizers.Adam(learning_rate=0.0001),
    metrics=["accuracy"])

history = lstm_model.fit(x=train['inputs'], y=train['labels'], epochs=1,
                         validation_data=(val['inputs'], val['labels']))

The number of layers doesn't seem to affect the issue (I added this many while trying to overfit the model).

The summary of the model is as follows:

Model: "sequential_108"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_297 (Dense)            (1, 30, 256)              1536      
_________________________________________________________________
activation_128 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_298 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_129 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_299 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_130 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_300 (Dense)            (1, 30, 256)              65792     
_________________________________________________________________
activation_131 (Activation)  (1, 30, 256)              0         
_________________________________________________________________
dense_301 (Dense)            (1, 30, 1)                257       
=================================================================
Total params: 199,169
Trainable params: 199,169
Non-trainable params: 0

As you can see, the output layer returns an array of shape (30, 1), and the same happens when making actual predictions with the model.

I've also tried reshaping the labels to (3511,) and (3511, 1, 1), but neither fixed the issue.

What could be causing this behavior?
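The shape comes from how Dense handles higher-rank input: it transforms only the last axis, so with a (batch, 30, 5) tensor, every one of the 30 timesteps is mapped independently. A minimal NumPy sketch of the underlying matrix algebra (the kernel W and bias b here are illustrative random values, not the model's actual weights):

```python
import numpy as np

# Dense applies its kernel to the last axis only, so each of the
# 30 timesteps gets its own output -- hence the (30, 1) shape.
x = np.random.rand(1, 30, 5)   # (batch, timesteps, features)
W = np.random.rand(5, 1)       # kernel of a Dense(1) layer
b = np.zeros(1)

y = x @ W + b                  # matmul broadcasts over the time axis
print(y.shape)                 # (1, 30, 1) -- one output per timestep
```

Nothing in the stack of Dense layers ever collapses the time axis, which is why the final output still has 30 entries.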

I assume you want to use an LSTM layer, since you are working with 3-dimensional time-series input.

All you need to do is set return_sequences to False in the last LSTM layer, for example:

lstm_model = tf.keras.models.Sequential([
    tf.keras.layers.LSTM(5, return_sequences=True, dropout=0.2, recurrent_dropout=0.2),
    tf.keras.layers.LSTM(10, return_sequences=True, activation='relu'),
    tf.keras.layers.LSTM(64, return_sequences=False, activation='relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(256),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
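To see why return_sequences=False collapses the time axis, here is a toy recurrence in plain NumPy (a deliberately simplified RNN, not the real LSTM equations): with return_sequences=True the layer emits the hidden state at every step, with False it emits only the final one.

```python
import numpy as np

def toy_rnn(x, units, return_sequences):
    """Simplified recurrence illustrating output shapes, not a real LSTM."""
    rng = np.random.default_rng(0)
    Wx = rng.standard_normal((x.shape[1], units))  # input-to-hidden weights
    Wh = rng.standard_normal((units, units))       # hidden-to-hidden weights
    h = np.zeros(units)
    outputs = []
    for t in range(x.shape[0]):          # step through the time axis
        h = np.tanh(x[t] @ Wx + h @ Wh)
        outputs.append(h)
    # True: all per-step states; False: only the last state
    return np.stack(outputs) if return_sequences else h

x = np.random.rand(30, 5)                # (timesteps, features), one sample
print(toy_rnn(x, 64, return_sequences=True).shape)   # (30, 64)
print(toy_rnn(x, 64, return_sequences=False).shape)  # (64,)
```

Once the last LSTM returns only the final state, the Dense layers operate on a rank-2 (batch, units) tensor and the model ends in the expected (batch, 1) output.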

Some explanation of how shapes in LSTM layers work is provided, e.g., in this question:

How to stack multiple lstm in keras?
