Keras-WGAN Critic And Generator Accuracy Stuck At 0

I am trying to implement a WGAN in Keras, using David Foster's book Generative Deep Learning and this code as references. I wrote the simple code below. However, whenever I start training the model, the accuracy is always 0 and the Critic and Generator losses hover around 0.

No matter how many epochs they train for, they stay stuck at these numbers. I have tried various network configurations and different hyperparameters, but the results do not seem to change. Google has not been much help either. I cannot pinpoint the source of this behavior.

Here is the code I wrote.


import numpy as np

from keras.datasets import mnist
from keras.layers import Dense, Reshape, Flatten, Dropout
from keras.layers import BatchNormalization, Activation, ZeroPadding2D
from keras.layers.advanced_activations import LeakyReLU
from keras.layers.convolutional import UpSampling2D, Conv2D
from keras.models import Sequential
from keras.optimizers import RMSprop
import keras.backend as K

def wasserstein_loss(y_true, y_pred):
    return K.mean(y_true * y_pred)

class WGAN:

    def __init__(self):

        # Data Params
        self.genInput=100
        self.imChannels=1
        self.imShape = (28,28,1)

        # Build Models
        self.onBuildDiscriminator()
        self.onBuildGenerator()
        self.onBuildGAN()

        pass

    def onBuildGAN(self):

        if self.mGenerator is None or self.mDiscriminator is None: raise Exception('Generator Or Discriminator Uninitialized.')

        self.mDiscriminator.trainable=False

        self.mGAN=Sequential()
        self.mGAN.add(self.mGenerator)
        self.mGAN.add(self.mDiscriminator)

        ganOptimizer=RMSprop(lr=0.00005)
        self.mGAN.compile(loss=wasserstein_loss, optimizer=ganOptimizer, metrics=['accuracy'])

        print('GAN Model')
        self.mGAN.summary()
        pass

    def onBuildGenerator(self):

        self.mGenerator=Sequential()

        self.mGenerator.add(Dense(128 * 7 * 7, activation="relu", input_dim=self.genInput))
        self.mGenerator.add(Reshape((7, 7, 128)))
        self.mGenerator.add(UpSampling2D())
        self.mGenerator.add(Conv2D(128, kernel_size=4, padding="same"))
        self.mGenerator.add(BatchNormalization(momentum=0.8))
        self.mGenerator.add(Activation("relu"))
        self.mGenerator.add(UpSampling2D())
        self.mGenerator.add(Conv2D(64, kernel_size=4, padding="same"))
        self.mGenerator.add(BatchNormalization(momentum=0.8))
        self.mGenerator.add(Activation("relu"))
        self.mGenerator.add(Conv2D(self.imChannels, kernel_size=4, padding="same"))
        self.mGenerator.add(Activation("tanh"))

        print('Generator Model')
        self.mGenerator.summary()
        pass

    def onBuildDiscriminator(self):

        self.mDiscriminator = Sequential()

        self.mDiscriminator.add(Conv2D(16, kernel_size=3, strides=2, input_shape=self.imShape, padding="same"))
        self.mDiscriminator.add(LeakyReLU(alpha=0.2))
        self.mDiscriminator.add(Dropout(0.25))
        self.mDiscriminator.add(Conv2D(32, kernel_size=3, strides=2, padding="same"))
        self.mDiscriminator.add(ZeroPadding2D(padding=((0,1),(0,1))))
        self.mDiscriminator.add(BatchNormalization(momentum=0.8))
        self.mDiscriminator.add(LeakyReLU(alpha=0.2))
        self.mDiscriminator.add(Dropout(0.25))
        self.mDiscriminator.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
        self.mDiscriminator.add(BatchNormalization(momentum=0.8))
        self.mDiscriminator.add(LeakyReLU(alpha=0.2))
        self.mDiscriminator.add(Dropout(0.25))
        self.mDiscriminator.add(Conv2D(128, kernel_size=3, strides=1, padding="same"))
        self.mDiscriminator.add(BatchNormalization(momentum=0.8))
        self.mDiscriminator.add(LeakyReLU(alpha=0.2))
        self.mDiscriminator.add(Dropout(0.25))
        self.mDiscriminator.add(Flatten())
        self.mDiscriminator.add(Dense(1))

        disOptimizer=RMSprop(lr=0.00005)
        self.mDiscriminator.compile(loss=wasserstein_loss, optimizer=disOptimizer, metrics=['accuracy'])

        print('Discriminator Model')
        self.mDiscriminator.summary()

        pass

    def fit(self, trainData, nEpochs=1000, batchSize=64):

        lblForReal = -np.ones((batchSize, 1))
        lblForGene = np.ones((batchSize, 1))

        for ep in range(1, nEpochs+1):

            for __ in range(5):

                # Get Valid Images
                validImages = trainData[ np.random.randint(0, trainData.shape[0], batchSize) ]

                # Get Generated Images
                noiseForGene=np.random.normal(0, 1, size=(batchSize, self.genInput))
                geneImages=self.mGenerator.predict(noiseForGene)

                # Train Critic On Valid And Generated Images With Labels -1 And 1 Respectively
                disValidLoss=self.mDiscriminator.train_on_batch(validImages, lblForReal)
                disGeneLoss=self.mDiscriminator.train_on_batch(geneImages, lblForGene)

                # Perform Critic Weight Clipping
                for l in self.mDiscriminator.layers:
                    weights = l.get_weights()
                    weights = [np.clip(w, -0.01, 0.01) for w in weights]
                    l.set_weights(weights)

            # Train Generator Using Combined Model
            geneLoss=self.mGAN.train_on_batch(noiseForGene, lblForReal)

            print(' Epoch', ep, 'Critic Valid Loss,Acc', disValidLoss, 'Critic Generated Loss,Acc', disGeneLoss, 'Generator Loss,Acc', geneLoss)
        pass

    pass

if __name__ == '__main__':
    (trainData, __), (__, __) = mnist.load_data()
    trainData = (trainData.astype(np.float32)/127.5) - 1
    trainData = np.expand_dims(trainData, axis=3)

    WGan = WGAN()
    WGan.fit(trainData)

For every configuration I tried, the output I get looks very much like the following.


 Epoch 1 Critic Valid Loss,Acc [-0.00016362152, 0.0] Critic Generated Loss,Acc [0.0003417502, 0.0] Generator Loss,Acc [-0.00016735379, 0.0]
 Epoch 2 Critic Valid Loss,Acc [-0.0001719332, 0.0] Critic Generated Loss,Acc [0.0003365979, 0.0] Generator Loss,Acc [-0.00017250411, 0.0]
 Epoch 3 Critic Valid Loss,Acc [-0.00017473527, 0.0] Critic Generated Loss,Acc [0.00032945914, 0.0] Generator Loss,Acc [-0.00017612436, 0.0]
 Epoch 4 Critic Valid Loss,Acc [-0.00017181305, 0.0] Critic Generated Loss,Acc [0.0003266656, 0.0] Generator Loss,Acc [-0.00016987178, 0.0]
 Epoch 5 Critic Valid Loss,Acc [-0.0001683443, 0.0] Critic Generated Loss,Acc [0.00032702673, 0.0] Generator Loss,Acc [-0.00016638976, 0.0]
 Epoch 6 Critic Valid Loss,Acc [-0.00017005506, 0.0] Critic Generated Loss,Acc [0.00032805002, 0.0] Generator Loss,Acc [-0.00017040147, 0.0]
 Epoch 7 Critic Valid Loss,Acc [-0.00017353195, 0.0] Critic Generated Loss,Acc [0.00033711304, 0.0] Generator Loss,Acc [-0.00017537423, 0.0]
 Epoch 8 Critic Valid Loss,Acc [-0.00017059325, 0.0] Critic Generated Loss,Acc [0.0003263024, 0.0] Generator Loss,Acc [-0.00016974319, 0.0]
 Epoch 9 Critic Valid Loss,Acc [-0.00017530039, 0.0] Critic Generated Loss,Acc [0.00032463064, 0.0] Generator Loss,Acc [-0.00017845634, 0.0]
 Epoch 10 Critic Valid Loss,Acc [-0.00017530067, 0.0] Critic Generated Loss,Acc [0.00033131015, 0.0] Generator Loss,Acc [-0.00017526663, 0.0]

I ran into a similar problem. The issue with WGAN is that the weight-clipping approach really cripples the model's capacity to learn, so learning can saturate very quickly: the weights are updated by backpropagation, but are then immediately clipped back into range. I suggest you experiment with more extreme clipping values; try [-1, 1] and [-0.0001, 0.0001], and you will definitely see a change. Saturation example: WGAN critic loss over 100,000 epochs:

As you can see, the loss reaches 0.999975 within the first few hundred iterations and then does not change at all for the remaining 100,000. I experimented with different clipping values; the loss values differed, but the behavior was the same. With [-0.005, 0.005] the loss saturated around 1, and with [-0.02, 0.02] around 0.8.
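If you want to reproduce that sweep on the code in the question, it helps to pull the clipping step out into a helper so the clip value becomes a tunable hyperparameter. A minimal sketch, assuming the question's Sequential critic (the helper name clip_critic_weights is my own):


import numpy as np

def clip_critic_weights(critic, clip_value):
    # Clip every weight tensor of the critic into [-clip_value, clip_value].
    # Note: get_weights() also returns BatchNormalization moving statistics,
    # which get clipped along with the trainable weights here, exactly as in
    # the question's loop.
    for layer in critic.layers:
        layer.set_weights([np.clip(w, -clip_value, clip_value)
                           for w in layer.get_weights()])

# In WGAN.fit, replace the hard-coded 0.01 with a parameter and sweep it,
# e.g. for clip_value in (1.0, 0.02, 0.01, 0.005, 0.0001), comparing the
# resulting critic loss curves.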

Your implementation looks correct, but sometimes there is only so much you can do in a GAN. So I suggest you try WGAN with gradient penalty (WGAN-GP) instead. It enforces K-Lipschitz continuity in a much nicer way, by pinning the L2 norm of the critic's gradient at interpolated images close to 1 (check out the paper). When evaluating a WGAN-GP, you should ideally see the critic loss start at some large negative value and then converge towards 0.
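The penalty term itself is short. Here is a minimal sketch written in TensorFlow 2 style with tf.GradientTape, since the standalone Keras API used in the question makes the required gradients awkward to compute; the function name gradient_penalty is my own, and lambda = 10 is the default weighting from the WGAN-GP paper:


import tensorflow as tf

def gradient_penalty(critic, real_images, fake_images):
    # Interpolate uniformly between real and generated samples.
    batch_size = tf.shape(real_images)[0]
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = critic(interpolated, training=True)

    # Gradient of the critic's score w.r.t. the interpolated images,
    # reduced to one L2 norm per sample.
    grads = tape.gradient(scores, interpolated)
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)

    # Penalize deviation of the gradient norm from 1 (the K-Lipschitz target).
    return tf.reduce_mean(tf.square(norms - 1.0))

# Critic loss with the paper's default weighting (lambda = 10):
#   critic_loss = tf.reduce_mean(fake_scores) - tf.reduce_mean(real_scores) \
#                 + 10.0 * gradient_penalty(critic, real_batch, fake_batch)


One caveat from the paper: the penalty is computed per sample, so batch normalization in the critic (which couples samples within a batch) should be removed or replaced with layer normalization before switching to WGAN-GP.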
