Tensorflow Convolutional Autoencoder

I've been trying to implement a convolutional autoencoder in Tensorflow, similar to how it was done in Keras in this tutorial.

So far this is what my code looks like:

import tensorflow as tf
import matplotlib.pyplot as plt

# x_train and x_test are assumed to be the MNIST images, scaled to [0, 1]

filter1 = tf.Variable(tf.random_normal([3, 3, 1, 16]))
filter2 = tf.Variable(tf.random_normal([3, 3, 16, 8]))
filter3 = tf.Variable(tf.random_normal([3, 3, 8, 8]))

d_filter1 = tf.Variable(tf.random_normal([3, 3, 8, 8]))
d_filter2 = tf.Variable(tf.random_normal([3, 3, 8, 8]))
d_filter3 = tf.Variable(tf.random_normal([3, 3, 8, 16]))
d_filter4 = tf.Variable(tf.random_normal([3, 3, 16, 1]))

def encoder(input_img):
    conv1 = tf.nn.relu(tf.nn.conv2d(input_img, filter1, strides=[1, 1, 1, 1], padding='SAME'))# [-1, 28, 28, 16]
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=2, strides=2, padding='SAME') # [-1, 14, 14, 16]
    conv2 = tf.nn.relu(tf.nn.conv2d(pool1, filter2, strides=[1, 1, 1, 1], padding='SAME')) # [-1, 14, 14, 8]
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=2, strides=2, padding='SAME') # [-1, 7, 7, 8]
    conv3 = tf.nn.relu(tf.nn.conv2d(pool2, filter3, strides=[1, 1, 1, 1], padding='SAME')) # [-1, 7, 7, 8]
    pool3 = tf.layers.max_pooling2d(inputs=conv3, pool_size=2, strides=2, padding='SAME') # [-1, 4, 4, 8]

    return pool3

def decoder(encoded):
    d_conv1 = tf.nn.relu(tf.nn.conv2d(encoded, d_filter1, strides=[1, 1, 1, 1], padding='SAME')) # [-1, 4, 4, 8]
    d_pool1 = tf.keras.layers.UpSampling2D((2, 2))(d_conv1) # [-1, 8, 8, 8]
    d_conv2 = tf.nn.relu(tf.nn.conv2d(d_pool1, d_filter2, strides=[1, 1, 1, 1], padding='SAME')) # [-1, 8, 8, 8]
    d_pool2 = tf.keras.layers.UpSampling2D((2, 2))(d_conv2) # [-1, 16, 16, 8]
    d_conv3 = tf.nn.relu(tf.nn.conv2d(d_pool2, d_filter3, strides=[1, 1, 1, 1], padding='VALID')) # [-1, 14, 14, 16]
    d_pool3 = tf.keras.layers.UpSampling2D((2, 2))(d_conv3) # [-1, 28, 28, 16]
    decoded = tf.nn.sigmoid(tf.nn.conv2d(d_pool3, d_filter4, strides=[1, 1, 1, 1], padding='SAME')) # [-1, 28, 28, 1]

    return decoded

x = tf.placeholder(tf.float32, [None, 28, 28, 1])
encoded = encoder(x)
autoencoder = decoder(encoded)
loss = tf.reduce_mean(tf.keras.losses.binary_crossentropy(y_true=x, y_pred=autoencoder))
optimizer = tf.train.AdadeltaOptimizer(learning_rate=0.1).minimize(loss)
batch_size = 128
epochs = 50

saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    num_batches = int(x_train.shape[0]/batch_size)
    for epoch in range(epochs):
        avg_epoch_loss = 0.0
        for k in range(num_batches):
            batch_x = x_train[k*batch_size:k*batch_size+batch_size]
            feed_dict = {x: batch_x.reshape([-1, 28, 28, 1])}
            _, l = sess.run([optimizer, loss], feed_dict=feed_dict)
            avg_epoch_loss += l

            if k % 100 == 0:
                print('Step {}/{} of epoch {}/{} completed with loss {}'.format(k, num_batches, epoch, epochs, l))

        avg_epoch_loss /= num_batches
        print('Epoch {}/{} completed with average loss {}'.format(epoch, epochs, avg_epoch_loss))
        saver.save(sess=sess, save_path='./model.ckpt')

        img = sess.run(autoencoder, feed_dict={x: x_test[0].reshape([1, 28, 28, 1])}).reshape(28, 28)
        plt.imshow(img, cmap='gray')
        plt.show()

When I train this, the loss value tends to go down but then stays around the same (high) value. However, when I replace the encoder and decoder functions with the following, which uses the Keras methods from the above link, the loss decreases at a reasonable rate and converges to a low value.

from keras.layers import Conv2D, MaxPooling2D, UpSampling2D

def encoder(input_img):
    x = Conv2D(16, (3, 3), activation='relu', padding='same')(input_img)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = MaxPooling2D((2, 2), padding='same')(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    encoded = MaxPooling2D((2, 2), padding='same')(x)

    return encoded

def decoder(encoded):
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
    x = UpSampling2D((2, 2))(x)
    x = Conv2D(16, (3, 3), activation='relu')(x)
    x = UpSampling2D((2, 2))(x)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

    return decoded

I'm trying to figure out what the difference is between these two methods. I've looked over it several times and it seems like my method should be doing the exact same thing as the Keras method. Any help in figuring out what's going on would be appreciated!

The one simple, obvious problem in your code is that you are not initializing your filters correctly. Try the following; it might work. You can also try some other, more sophisticated initialization scheme, such as the Xavier initializer.

filter1 = tf.Variable(tf.random_normal([3, 3, 1, 16], mean=0.0, stddev=0.01))
filter2 = tf.Variable(tf.random_normal([3, 3, 16, 8], mean=0.0, stddev=0.01))
filter3 = tf.Variable(tf.random_normal([3, 3, 8, 8], mean=0.0, stddev=0.01))

d_filter1 = tf.Variable(tf.random_normal([3, 3, 8, 8], mean=0.0, stddev=0.01))
d_filter2 = tf.Variable(tf.random_normal([3, 3, 8, 8], mean=0.0, stddev=0.01))
d_filter3 = tf.Variable(tf.random_normal([3, 3, 8, 16], mean=0.0, stddev=0.01))
d_filter4 = tf.Variable(tf.random_normal([3, 3, 16, 1], mean=0.0, stddev=0.01))
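
For the Xavier option, a minimal sketch using TF 1.x's built-in tf.glorot_uniform_initializer (Glorot/Xavier uniform) might look like this; the variable names simply mirror the filters above:

# Sketch: Xavier/Glorot initialization via tf.get_variable (TF 1.x)
xavier = tf.glorot_uniform_initializer()

filter1 = tf.get_variable('filter1', shape=[3, 3, 1, 16], initializer=xavier)
filter2 = tf.get_variable('filter2', shape=[3, 3, 16, 8], initializer=xavier)
filter3 = tf.get_variable('filter3', shape=[3, 3, 8, 8], initializer=xavier)

d_filter1 = tf.get_variable('d_filter1', shape=[3, 3, 8, 8], initializer=xavier)
d_filter2 = tf.get_variable('d_filter2', shape=[3, 3, 8, 8], initializer=xavier)
d_filter3 = tf.get_variable('d_filter3', shape=[3, 3, 8, 16], initializer=xavier)
d_filter4 = tf.get_variable('d_filter4', shape=[3, 3, 16, 1], initializer=xavier)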

A difference between your version and the Keras code you point at is the learning rate of the optimizer. Different learning rates can lead to quite different outcomes (convergence, divergence, instabilities).

The tutorial uses AdaDelta's default learning rate, which is 1.0. Your version uses the same optimizer (AdaDelta), but you set the learning rate to 0.1.

What if you set the same value? (It may be useful to check that rho, epsilon, and the decay have the same values too.)
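
As a sketch, matching the Keras defaults of the time (assumed here to be learning_rate=1.0, rho=0.95, epsilon=1e-07; check the defaults of your Keras version) would look like:

# Sketch: AdaDelta with hyperparameters matched to the assumed Keras defaults
optimizer = tf.train.AdadeltaOptimizer(learning_rate=1.0, rho=0.95,
                                       epsilon=1e-07).minimize(loss)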
