CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model
I have a CNN-LSTM that looks as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, LSTM, Dense, TimeDistributed

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
               input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
I'm using this CNN-LSTM for a multivariate time series forecasting problem. The input data comes in a 4D format: [samples, subsequences, timesteps, features]. For some reason, I need the TimeDistributed layers, or I get errors like:

ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]
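For concreteness, this is how such a 4D array can be prepared from ordinary [samples, timesteps, features] windows (a hypothetical example using the dimensions from the error message: 4 subsequences of 8 steps, 35 features):

```python
import numpy as np

n_subsequences, n_steps, n_features = 4, 8, 35

# Hypothetical windowed data: 100 samples, each a window of 32 timesteps.
X = np.random.rand(100, n_subsequences * n_steps, n_features)

# Split each 32-step window into 4 subsequences of 8 steps each,
# giving the 4D shape [samples, subsequences, timesteps, features].
X_4d = X.reshape(X.shape[0], n_subsequences, n_steps, n_features)

print(X_4d.shape)  # (100, 4, 8, 35)
```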
I think this has to do with the fact that Conv1D officially isn't meant for time series, so to preserve the time-series data shape we need a wrapper layer like TimeDistributed. I don't really mind using TimeDistributed layers; they're wrappers, and if they make my model work I'm happy. However, when I try to visualize my model with
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
the resulting visualization only shows the Sequential() container:
I suspect this has to do with the TimeDistributed layers and the model not being built yet. I cannot call model.summary() either; it throws:

ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build

This is strange, because I have specified the input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.
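A quick way to confirm the "not built" diagnosis (a sketch, assuming TensorFlow 2.x and the shapes from the error message) is to build the model manually with the full input shape, including the batch dimension; after that, summary() and plot_model() can see every layer:

```python
import tensorflow as tf

# Same architecture as in the question, minus the input_shape on Conv1D.
model = tf.keras.Sequential([
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Conv1D(filters=64, kernel_size=1, activation='relu')),
    tf.keras.layers.TimeDistributed(tf.keras.layers.MaxPooling1D(pool_size=2)),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Flatten()),
    tf.keras.layers.LSTM(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Explicit build with (batch, subsequences, timesteps, features);
# None stands for the batch dimension.
model.build(input_shape=(None, 4, 8, 35))
model.summary()  # no longer raises "This model has not yet been built"
```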
I would like a working model together with a working tf.keras.utils.plot_model call. Any explanation as to why I need TimeDistributed, and why it makes plot_model behave weirdly, would be greatly appreciated.
An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper, and not to the Conv1D layer:
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
        input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
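To sanity-check this fix end to end, here is a self-contained version with dummy data (shapes taken from the question; assuming TensorFlow 2.x). Because the wrapper now carries the input_shape, the model is built as soon as the first layer is added, so summary() works without a prior build() or fit():

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, LSTM, Dense, TimeDistributed

n_filters, n_kernel = 64, 1
n_subsequences, n_steps, n_features = 4, 8, 35

# Dummy 4D input: [samples, subsequences, timesteps, features].
X_train = np.random.rand(16, n_subsequences, n_steps, n_features)

model = Sequential()
# input_shape on the TimeDistributed wrapper lets Sequential build immediately.
model.add(TimeDistributed(
    Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
    input_shape=(n_subsequences, n_steps, X_train.shape[3])))
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')

model.summary()  # works: the model is already built
preds = model.predict(X_train, verbose=0)
print(preds.shape)  # (16, 1)
```

The same applies to tf.keras.utils.plot_model (which additionally needs pydot and graphviz installed).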
Add your input layer at the beginning. Try this:
def DNN_Model(X_train):
    model = Sequential()
    # Note: here X_train is used directly as the feature count
    # (see DNN_Model(3) below), not as a data array.
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....
Now, you can plot and get a summary properly.
DNN_Model(3).summary() # OK
tf.keras.utils.plot_model(DNN_Model(3)) # OK