CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model
I have a CNN-LSTM as shown below:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv1D, Dense, Flatten, LSTM,
                                     MaxPooling1D, TimeDistributed)

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu',
               input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
```
I am using this CNN-LSTM for a multivariate time-series forecasting problem. The CNN-LSTM input data comes in the 4D format `[samples, subsequences, timesteps, features]`. For some reason, I need the `TimeDistributed` layers, or else I get an error like `ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]`. I think this has to do with the fact that `Conv1D` officially expects 3D input, so to preserve the shape of the time-series data we need a wrapper layer like `TimeDistributed`. I don't really mind using `TimeDistributed` layers - they are wrappers, and if they make my model work I am happy. However, when I try to visualize the model with
```python
file = 'CNN_LSTM_Visualization.png'
tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)
```
the resulting visualization only shows a single `Sequential()` node.
I suspect this has to do with the `TimeDistributed` layers and the model not having been built yet. I also cannot call `model.summary()` - it throws `ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build`. This is strange, because I did specify an `input_shape`, albeit in the `Conv1D` layer and not in the `TimeDistributed` wrapper.
I would like a working model together with a working `tf.keras.utils.plot_model` call. Any explanation as to why I need `TimeDistributed`, and why it makes `plot_model` behave weirdly, would be greatly appreciated.
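For context, the 4D layout described in the question can be produced from an ordinary `[samples, timesteps, features]` array by splitting the timestep axis. A minimal sketch (not from the original post; the sample and feature counts are illustrative, with 35 features taken from the error message above):

```python
import numpy as np

# Turn a standard 3D time-series array [samples, timesteps, features]
# into the 4D layout [samples, subsequences, steps, features] that the
# TimeDistributed CNN-LSTM expects. Assumes timesteps divides evenly:
# SEQUENCE_LENGTH (32) = n_subsequences (4) * n_steps (8).
n_samples, n_features = 10, 35
n_subsequences, n_steps = 4, 8

X = np.random.rand(n_samples, n_subsequences * n_steps, n_features)
X_4d = X.reshape(n_samples, n_subsequences, n_steps, n_features)

print(X_4d.shape)  # (10, 4, 8, 35)
```

Each of the 4 subsequences is then processed by the same `Conv1D` stack via `TimeDistributed`, and the LSTM runs over the subsequence axis.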
An alternative to using an `Input` layer is to simply pass the `input_shape` to the `TimeDistributed` wrapper rather than to the `Conv1D` layer:
```python
def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
        input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
```
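To see what this version feeds into the LSTM, it can help to track the shapes layer by layer. A pure-Python bookkeeping sketch (no TensorFlow required; assumes `'valid'` padding and stride 1, which are the Keras defaults, and 35 input features as in the question's error message):

```python
# Shape bookkeeping for the TimeDistributed stack applied to
# [batch, n_subsequences, n_steps, n_features] = [None, 4, 8, 35].
# TimeDistributed applies each layer independently per subsequence,
# so only the (steps, features) pair changes.
n_subsequences, n_steps, n_features = 4, 8, 35
n_filters, n_kernel, pool = 64, 1, 2

steps, feats = n_steps, n_features
steps, feats = steps - n_kernel + 1, n_filters  # TimeDistributed(Conv1D)       -> (4, 8, 64)
steps, feats = steps - n_kernel + 1, n_filters  # TimeDistributed(Conv1D)       -> (4, 8, 64)
steps = steps // pool                           # TimeDistributed(MaxPooling1D) -> (4, 4, 64)
flat = steps * feats                            # TimeDistributed(Flatten)      -> (4, 256)

print((n_subsequences, flat))  # (4, 256): the [timesteps, features] the LSTM sees
```

So the LSTM receives 4 timesteps (one per subsequence) of 256 features each, which is exactly the 3D input an LSTM expects.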
Add your input layer at the beginning. Try this:
```python
def DNN_Model(X_train):
    model = Sequential()
    # Note: here the X_train argument is used as the number of features
    # (the example below calls DNN_Model(3)).
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
                                     kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ...
```
Now you can plot the model and get the summary correctly:

```python
DNN_Model(3).summary()                   # OK
tf.keras.utils.plot_model(DNN_Model(3))  # OK
```