CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model
I have a CNN-LSTM that looks as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, LSTM, Dense, TimeDistributed

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
               input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
I'm using this CNN-LSTM for a multivariate time series forecasting problem. The input data comes in a 4D format: [samples, subsequences, timesteps, features]. For some reason, I need the TimeDistributed layers, or I get errors like:

ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]
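For concreteness, this is how such a 4D array can be prepared from ordinary [samples, timesteps, features] windows (a hypothetical example using the dimensions from the error message: 4 subsequences of 8 steps, 35 features):

```python
import numpy as np

n_subsequences, n_steps, n_features = 4, 8, 35

# Hypothetical windowed data: 100 samples, each a window of 32 timesteps.
X = np.random.rand(100, n_subsequences * n_steps, n_features)

# Split each 32-step window into 4 subsequences of 8 steps each,
# giving the 4D shape [samples, subsequences, timesteps, features].
X_4d = X.reshape(X.shape[0], n_subsequences, n_steps, n_features)

print(X_4d.shape)  # (100, 4, 8, 35)
```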
I think this has to do with the fact that Conv1D officially isn't meant for time series, so to preserve the time-series data shape we need a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers; they're wrappers, and if they make my model work I'm happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
the resulting visualization only shows the Sequential() container:
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either; it throws:

ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build

This is strange, because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
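A quick way to confirm the "not built" diagnosis (a sketch, assuming TensorFlow 2.x and the shapes from the error message) is to build the model manually with the full input shape, including the batch dimension; after that, summary() and plot_model() can see every layer:

```python
import tensorflow as tf

# Same architecture as in the question, minus the input_shape on Conv1D.
model = tf.keras.Sequential([
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Conv1D(filters=64, kernel_size=1, activation='relu')),
    tf.keras.layers.TimeDistributed(tf.keras.layers.MaxPooling1D(pool_size=2)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten()),
    tf.keras.layers.LSTM(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Explicit build with (batch, subsequences, timesteps, features);
# None stands for the batch dimension.
model.build(input_shape=(None, 4, 8, 35))
model.summary()  # no longer raises "This model has not yet been built"
```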
I would like a working model together with a working tf.keras.utils.plot_model call. Any explanation as to why I need TimeDistributed, and why it makes plot_model behave weirdly, would be greatly appreciated.
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not to the Conv1D layer:
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
        input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
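To sanity-check this fix end to end, here is a self-contained version with dummy data (shapes taken from the question; assuming TensorFlow 2.x). Because the wrapper now carries the input_shape, the model is built as soon as the first layer is added, so summary() works without a prior build() or fit():

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, LSTM, Dense, TimeDistributed

n_filters, n_kernel = 64, 1
n_subsequences, n_steps, n_features = 4, 8, 35

# Dummy 4D input: [samples, subsequences, timesteps, features].
X_train = np.random.rand(16, n_subsequences, n_steps, n_features)

model = Sequential()
# input_shape on the TimeDistributed wrapper lets Sequential build immediately.
model.add(TimeDistributed(
    Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
    input_shape=(n_subsequences, n_steps, X_train.shape[3])))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')

model.summary()  # works: the model is already built
preds = model.predict(X_train, verbose=0)
print(preds.shape)  # (16, 1)
```

The same applies to tf.keras.utils.plot_model (which additionally needs pydot and graphviz installed).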
Add your input layer at the beginning. Try this:
def DNN_Model(X_train):
    model = Sequential()
    # Note: here X_train is used directly as the feature count
    # (see DNN_Model(3) below), not as a data array.
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....
Now, you can plot and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK