
CNN-LSTM with TimeDistributed Layers behaving weirdly when trying to use tf.keras.utils.plot_model

I have a CNN-LSTM that looks like this:

SEQUENCE_LENGTH = 32
BATCH_SIZE = 32
EPOCHS = 30
n_filters = 64
n_kernel = 1
n_subsequences = 4
n_steps = 8

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv1D, MaxPooling1D,
                                     Flatten, LSTM, Dense)

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu', input_shape=(n_subsequences, n_steps, X_train.shape[3]))))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model

I'm using this CNN-LSTM for a multivariate time series forecasting problem. The CNN-LSTM input data comes in the 4D format [samples, subsequences, timesteps, features]. For some reason, I need the TimeDistributed layers, or I get an error like ValueError: Input 0 of layer conv1d is incompatible with the layer: expected ndim=3, found ndim=4. Full shape received: [None, 4, 8, 35]. I think this has to do with the fact that Conv1D is officially not meant for time series, so to preserve the time-series data shape we need a wrapper layer like TimeDistributed. I really don't mind using TimeDistributed layers - they are wrappers, and if they make my model work I'm happy. However, when I try to visualize the model with
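For context, a minimal sketch (my own illustration, using placeholder random data and the dimensions from the error message above) of how a batch of multivariate time series windows is reshaped into the 4D [samples, subsequences, timesteps, features] layout that this model expects:

```python
import numpy as np

# Hypothetical example data: 100 windows of 32 timesteps with 35 features,
# matching the [None, 4, 8, 35] shape from the error message.
n_samples, n_subsequences, n_steps, n_features = 100, 4, 8, 35

# 3D input: [samples, timesteps, features]
X = np.random.rand(n_samples, n_subsequences * n_steps, n_features)

# Split each 32-step window into 4 subsequences of 8 steps each:
# 4D output: [samples, subsequences, timesteps, features]
X_train = X.reshape(n_samples, n_subsequences, n_steps, n_features)

print(X_train.shape)  # (100, 4, 8, 35)
```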

    file = 'CNN_LSTM_Visualization.png'
    tf.keras.utils.plot_model(model, to_file=file, show_layer_names=False, show_shapes=False)

the resulting visualization only shows the Sequential() block.


I suspect this has to do with the TimeDistributed layers and the model not having been built yet. I also can't call model.summary() - it throws ValueError: This model has not yet been built. Build the model first by calling build() or calling fit() with some data, or specify an input_shape argument in the first layer(s) for automatic build. This is strange, because I have specified an input_shape, albeit in the Conv1D layer and not in the TimeDistributed wrapper.

I would like a working model together with a working tf.keras.utils.plot_model function. Any explanation of why I need TimeDistributed and why it makes the plot_model function behave weirdly would be greatly appreciated.
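For reference, one way to make summary() and plot_model() usable without moving the input_shape at all (a sketch of my own, not from the answers that follow, assuming TensorFlow 2.x and the [None, 4, 8, 35] shape from the error message) is to call model.build() explicitly before inspecting the model:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv1D, MaxPooling1D,
                                     Flatten, LSTM, Dense)

model = Sequential([
    TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu')),
    TimeDistributed(Conv1D(filters=64, kernel_size=1, activation='relu')),
    TimeDistributed(MaxPooling1D(pool_size=2)),
    TimeDistributed(Flatten()),
    LSTM(100, activation='relu'),
    Dense(100, activation='relu'),
    Dense(1, activation='sigmoid'),
])

# Explicitly build with the full batch shape [None, subsequences, steps, features];
# afterwards summary() no longer raises "This model has not yet been built".
model.build(input_shape=(None, 4, 8, 35))
model.summary()
```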

An alternative to using an Input layer is to simply pass the input_shape to the TimeDistributed wrapper rather than to the Conv1D layer:

def DNN_Model(X_train):
    model = Sequential()
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'), input_shape=(n_subsequences, n_steps, X_train.shape[3])))
    model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    model.add(TimeDistributed(Flatten()))
    model.add(LSTM(100, activation='relu'))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='mse', optimizer='adam')
    return model
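As a quick check (a sketch with random placeholder data, assuming TensorFlow 2.x), building this variant shows that the model is constructed eagerly because the input shape is known up front, so summary() and plot_model() both have a full graph to work with:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv1D, MaxPooling1D,
                                     Flatten, LSTM, Dense)

n_filters, n_kernel, n_subsequences, n_steps = 64, 1, 4, 8
X_train = np.random.rand(10, n_subsequences, n_steps, 35)  # placeholder data

model = Sequential()
model.add(TimeDistributed(
    Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu'),
    input_shape=(n_subsequences, n_steps, X_train.shape[3])))  # shape on the wrapper
model.add(TimeDistributed(Conv1D(filters=n_filters, kernel_size=n_kernel, activation='relu')))
model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
model.add(TimeDistributed(Flatten()))
model.add(LSTM(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam')

model.summary()            # works: the model is already built
print(model.output_shape)  # (None, 1)
```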

Add your input layer at the beginning. Try this:

def DNN_Model(X_train):
    # Note: here the X_train argument is the number of features,
    # e.g. DNN_Model(3), not the training array itself.
    model = Sequential()
    model.add(InputLayer(input_shape=(n_subsequences, n_steps, X_train)))
    model.add(TimeDistributed(
        Conv1D(filters=n_filters, kernel_size=n_kernel,
               activation='relu')))
    model.add(TimeDistributed(Conv1D(filters=n_filters,
              kernel_size=n_kernel, activation='relu')))
    model.add(TimeDistributed(MaxPooling1D(pool_size=2)))
    ....

Now you can plot the model and get the summary correctly.

DNN_Model(3).summary() # OK 
tf.keras.utils.plot_model(DNN_Model(3)) # OK
