CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

Question

I have a CNN-LSTM that looks as follows;

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model

I'm using this CNN-LSTM for a multivariate time series forecasting problem. the CNN-LSTM input data comes in the 4D format: [samples, subsequences, timesteps, features]. For some reason, I need TimeDistributed Layers; or I get errors like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35] ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35] . I think this has to do with the fact that Conv1D is officially not meant for time series, so to preserve time-series data shape we need to use a wrapper layer like TimeDistributed . I don't really mind using TimeDistributed layers - They're wrappers and if they make my model work I am happy. However, when I try to visualize my model with

    file = 'CNN_LSTM_Visualization.png'
    tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)

The resulting visualization only shows the Sequential() :

I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either - it throws ValueError: This model has not yet been built. Build the model first by calling ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build Which is strange because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.

I would like a working model together with a working tf.keras.utils.plot_model function. Any explanation as to why I need TimeDistributed and why it makes the plot_model function behave weirdly would be greatly awesome.

Answer 1

An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not the Conv1D layer:

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'), input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model

Answer 2

Add your input layer at the beginning. Try this

def DNN_Model(X_train):
    model = Sequential()
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
              kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....

Now, you can plot and get a summary properly.

DNN_Model(3).summary() # OK 
tf.keras.utils.plot_model(DNN_Model(3)) # OK

CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

Question

2 answers

solution1
2 2021-06-04 17:27:56

solution2
1 ACCPTED 2021-06-04 17:21:15

CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

Question

2 answers

solution1 2 2021-06-04 17:27:56

solution2 1 ACCPTED 2021-06-04 17:21:15

solution1
2 2021-06-04 17:27:56

solution2
1 ACCPTED 2021-06-04 17:21:15