ML model fit on the training data outperforms the model fit on a generator

Question

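My end goal is to fit an ML autoencoder by passing a data generator to the fit method of the keras API. However, I am finding that models fit with a generator are worse than models fit on the raw data itself.

To demonstrate this, I took the following steps: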

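1. Define a data generator that creates a set of variably damped sine waves. Importantly, I set the generator's batch size equal to the size of the entire training dataset. This way I can rule out batch size as a possible cause of the poor performance of the model fit with the generator.
2. Define a very simple ML autoencoder. Note that the autoencoder's latent space is larger than the size of the original data, so it should learn to reproduce the original signal relatively quickly.
3. Train one model using the generator.
4. Use the generator's __getitem__ method to create a set of training data, and use that data to fit the same ML model.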

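The results from the model trained on the generator are far worse than those from the model trained on the data itself.

My formulation of the generator must be wrong but, for the life of me, I cannot find my mistake. For reference, I modeled my generator on the ones discussed here and here.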

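Update:

I simplified the problem so that the generator, instead of producing a series of randomly parameterized damped sine waves, now produces a vector of ones (i.e., np.ones((batch_size, 1000, 1))). I fit the autoencoder model and, as before, the model fit with the generator still underperforms the model fit on the raw data itself.

Side note: I edited the originally posted code to reflect this update.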

import numpy as np
import matplotlib.pyplot as plt
from tensorflow import keras  # same Keras as the tensorflow.keras imports below
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv1D, Conv1DTranspose, MaxPool1D
import tensorflow as tf


""" Generate training/testing data (i.e., a vector of ones) """


class DataGenerator(keras.utils.Sequence):
    def __init__(
        self,
        batch_size,
        vector_length,
    ):
        super().__init__()  # let the Sequence base class initialize itself
        self.batch_size = batch_size
        self.vector_length = vector_length

    def __getitem__(self, index):
        x = np.ones((self.batch_size, self.vector_length, 1))
        y = np.ones((self.batch_size, self.vector_length, 1))
        return x, y

    def __len__(self):
        return 1  # one batch of data per epoch


vector_length = 1000
train_gen = DataGenerator(800, vector_length)
test_gen = DataGenerator(200, vector_length)


""" Machine Learning Model and Training """

# Class to hold ML model
class MLModel:
    def __init__(self, n_inputs):
        self.n_inputs = n_inputs

        visible = Input(shape=n_inputs)
        encoder = Conv1D(
            filters=1,
            kernel_size=100,
            padding="same",
            strides=1,
            activation="LeakyReLU",
        )(visible)
        encoder = MaxPool1D(pool_size=2)(encoder)

        # decoder
        decoder = Conv1DTranspose(
            filters=1,
            kernel_size=100,
            padding="same",
            strides=2,
            activation="linear",
        )(encoder)

        model = Model(inputs=visible, outputs=decoder)
        model.compile(optimizer="adam", loss="mse")
        self.model = model


""" EXPERIMENT 1 """

# instantiate a model
n_inputs = (vector_length, 1)
model1 = MLModel(n_inputs).model

# train first model!
model1.fit(x=train_gen, epochs=10, validation_data=test_gen)

""" EXPERIMENT 2 """

# use the generator to create training and testing data
train_x, train_y = train_gen.__getitem__(0)
test_x, test_y = test_gen.__getitem__(0)

# instantiate a new model
model2 = MLModel(n_inputs).model

# train second model!
history = model2.fit(train_x, train_y, validation_data=(test_x, test_y), epochs=10)


""" Model evaluation and plotting """

pred_y1 = model1.predict(test_x)
pred_y2 = model2.predict(test_x)

plt.ion()
plt.clf()
n = 0
plt.plot(test_y[n, :, 0], label="True Signal")
plt.plot(pred_y1[n, :, 0], label="Model1 Prediction")
plt.plot(pred_y2[n, :, 0], label="Model2 Prediction")
plt.legend()
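
The plot gives only a visual comparison. A minimal sketch of a numeric one follows, assuming mean squared error (the loss both models are compiled with) is the metric of interest. The helper name mse is hypothetical, and the arrays are invented placeholders shaped like test_y, not results from the models above:

```python
import numpy as np


def mse(y_true, y_pred):
    """Mean squared error over every element of two same-shaped arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


# Placeholder arrays shaped like test_y in the post: (200, 1000, 1).
# The constant values are made up purely to exercise the helper.
y_true = np.ones((200, 1000, 1))
pred_a = np.full_like(y_true, 0.5)
pred_b = np.full_like(y_true, 0.9)

print("placeholder A MSE:", mse(y_true, pred_a))
print("placeholder B MSE:", mse(y_true, pred_b))
```

With the real script, mse(test_y, pred_y1) and mse(test_y, pred_y2) would put a number on the gap between the two models.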

Answer 1

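I made a rookie mistake and forgot that model.fit defaults to batch_size=32. As a result, the experiments posted above were not an apples-to-apples comparison: the model fit with the generator used batch_size=800, while the model fit on the data itself used batch_size=32. When the same batch size is set for both experiments, the two models perform similarly.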
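
To make the mismatch concrete, here is a rough sketch that counts optimizer updates in each experiment, assuming one update per batch and that every epoch covers the 800 training samples exactly once. The helper name optimizer_updates is hypothetical and is not part of Keras:

```python
import math


def optimizer_updates(n_samples, batch_size, epochs):
    """Count gradient updates when each epoch passes over n_samples once."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs


# Experiment 1: the generator yields a single batch of 800 samples per epoch.
generator_updates = optimizer_updates(n_samples=800, batch_size=800, epochs=10)

# Experiment 2: model.fit on arrays falls back to its default batch_size of 32.
array_updates = optimizer_updates(n_samples=800, batch_size=32, epochs=10)

print(generator_updates, array_updates)
```

Under these assumptions the second model receives 25 times as many weight updates in the same 10 epochs. Passing batch_size=800 to the second call, as in model2.fit(train_x, train_y, batch_size=800, validation_data=(test_x, test_y), epochs=10), is one way to put both runs on equal footing.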

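P.S. In case it helps anyone: I had not appreciated how important batch size is as a hyperparameter. There are of course caveats, nuances, and exceptions, but smaller batch sizes apparently help the model generalize. I won't rehash the topic, but there is interesting reading here, here, and here.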

Solution 1
0 2023-01-20 15:38:08