Autoencoder in TensorFlow. The loss does not go down

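I don't understand why the loss does not change. Things I have tried: changing the number of encoder and decoder layers according to the dimension formula, changing the learning rate, changing the optimizer, feeding both batches as noise-free images, changing the batch size, and checking that the input is valid. Sample output is shown below. Here is the whole code.

I'm fairly new to TensorFlow, so it may be something silly.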

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data
%matplotlib inline
import matplotlib.pyplot as plt

#Network
tf.reset_default_graph()

noise_imgs = tf.placeholder(tf.float32, [None, 28, 28, 1])
imgs = tf.placeholder(tf.float32, [None, 28, 28, 1])

# Building the encoder
def encoder(x):
   out1 = tf.layers.conv2d(x, 32, [3, 3], padding="valid", activation=tf.nn.relu) #26*26*32 
   out1pool = tf.layers.max_pooling2d(inputs=out1, pool_size=[2, 2], strides=2) #13*13*32
   out2 = tf.layers.conv2d(out1pool, 64, [3, 3], padding="valid", activation=tf.nn.relu) #11*11*64
   out1pool2 = tf.layers.max_pooling2d(inputs=out2, pool_size=[2, 2], strides=2) #5*5*64

   flat_inputs = tf.contrib.layers.flatten(out1pool2)
   hundred = tf.layers.dense(flat_inputs, units=100)
   return hundred

   # Building the decoder
def decoder(x):
   img = tf.reshape(x, [-1, 10, 10, 1])

   l1 = tf.layers.conv2d_transpose(img, 32, [7, 7], padding="valid", activation=tf.nn.relu)
   l2 = tf.layers.conv2d_transpose(l1, 1, [13, 13], padding="valid", activation=tf.nn.relu)          
   return l2 

   # Construct model
   encoder_op = encoder(noise_imgs)
   decoder_op = decoder(encoder_op)

loss = tf.sqrt(tf.reduce_sum(tf.square(imgs-decoder_op)))
optim = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(loss)


# Start Training
noise_constant=0.2
num_iter = 1000
batch_size = 128

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Training
for i in range(num_iter):  
    batch_x, _ = data.train.next_batch(batch_size)

    #shape (64, 784)
    batch = batch_x.reshape([batch_size, 28, 28, 1])

    noise_matrix = noise_constant * np.random.randn(batch_size, 784)
    noise_matrix = noise_matrix.reshape([batch_size, 28, 28, 1])

    batch_img_noise = batch_x + noise_matrix
    batch_img_noise = batch_img_noise.reshape([64, 28, 28, 1])


    # Run optimization op (backprop) and cost op (to get loss value)
     _, l = sess.run([optim, L], feed_dict={noise_imgs: batch_img_noise , imgs: batch})
   print(l)

Output: 152.3966 152.28357 152.38466 152.44324 152.20834 152.43982 152.36153 152.38193 152.28334 1652.4 1852.4 1852.456
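The layer sizes annotated in the code comments above can be checked with the usual size formulas for "valid" padding. This is a minimal sketch in plain Python; the helper names (`conv_valid`, `pool_valid`, `conv_transpose_valid`, `trace_shapes`) are hypothetical and not part of TensorFlow.

```python
def conv_valid(size, kernel, stride=1):
    """Output size of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1


def pool_valid(size, pool, stride):
    """Output size of 'valid' max pooling along one spatial axis."""
    return (size - pool) // stride + 1


def conv_transpose_valid(size, kernel, stride=1):
    """Output size of a 'valid' transposed convolution along one spatial axis."""
    return (size - 1) * stride + kernel


def trace_shapes():
    """Follow one spatial axis through the encoder and decoder of the question."""
    enc = [28]
    enc.append(conv_valid(enc[-1], 3))      # 3x3 conv
    enc.append(pool_valid(enc[-1], 2, 2))   # 2x2 max pool, stride 2
    enc.append(conv_valid(enc[-1], 3))      # 3x3 conv
    enc.append(pool_valid(enc[-1], 2, 2))   # 2x2 max pool, stride 2

    dec = [10]                              # 100 dense units reshaped to 10x10
    dec.append(conv_transpose_valid(dec[-1], 7))    # 7x7 transposed conv
    dec.append(conv_transpose_valid(dec[-1], 13))   # 13x13 transposed conv
    return enc, dec


encoder_sizes, decoder_sizes = trace_shapes()
print("encoder:", encoder_sizes)
print("decoder:", decoder_sizes)
```

Under these formulas the decoder output comes back to 28 per side, so the reconstruction has the same spatial size as the `imgs` placeholder.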

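import numpy as np
import tensorflow as tf  # TensorFlow 1.x API, as in the question
from tensorflow.examples.tutorials.mnist import input_data

# Load MNIST; the "MNIST_data/" directory is an assumed download location
mnist = input_data.read_data_sets("MNIST_data/")

# Network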
tf.reset_default_graph()

noise_imgs = tf.placeholder(tf.float32, [None, 28, 28, 1])
imgs = tf.placeholder(tf.float32, [None, 28, 28, 1])

# Building the encoder
def encoder(x):
   out1 = tf.layers.conv2d(x, 32, [3, 3], padding="valid", activation=tf.nn.relu) #26*26*32 
   out1pool = tf.layers.max_pooling2d(inputs=out1, pool_size=[2, 2], strides=2) #13*13*32
   out2 = tf.layers.conv2d(out1pool, 64, [3, 3], padding="valid", activation=tf.nn.relu) #11*11*64
   out1pool2 = tf.layers.max_pooling2d(inputs=out2, pool_size=[2, 2], strides=2) #5*5*64

   flat_inputs = tf.contrib.layers.flatten(out1pool2)
   hundred = tf.layers.dense(flat_inputs, units=100)
   return hundred

# Building the decoder
def decoder(x):
   img = tf.reshape(x, [-1, 10, 10, 1])

   l1 = tf.layers.conv2d_transpose(img, 32, [7, 7], padding="valid", activation=tf.nn.relu)
   l2 = tf.layers.conv2d_transpose(l1, 1, [13, 13], padding="valid", activation=tf.nn.relu)          
   return l2 

# Construct model
encoder_op = encoder(noise_imgs)
decoder_op = decoder(encoder_op)

loss = tf.sqrt(tf.reduce_sum(tf.square(imgs-decoder_op)))
optim = tf.train.AdamOptimizer(learning_rate = 0.001).minimize(loss)


# Start Training
noise_constant=0.2
num_iter = 1000
batch_size = 128

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Training
for i in range(num_iter):  
    batch_x, _ = mnist.train.next_batch(batch_size)

    # batch_x has shape (batch_size, 784)
    batch = batch_x.reshape([batch_size, 28, 28, 1])

    noise_matrix = noise_constant * np.random.randn(batch_size, 784)
    noise_matrix = noise_matrix.reshape([batch_size, 28, 28, 1])

    batch_img_noise = batch + noise_matrix
    batch_img_noise = batch_img_noise.reshape([batch_size, 28, 28, 1])


    # Run optimization op (backprop) and cost op (to get loss value)
    _, l = sess.run([optim, loss], feed_dict={noise_imgs: batch_img_noise , imgs: batch})
    if i % 100 == 0:
        print("Iter", i, ":", l)

I made a few changes to your code, such as replacing L with loss, so that it runs on my local machine.
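Comparing the two listings, another visible change is that the noise is added to the reshaped batch rather than to the flat batch_x, and the final reshape uses batch_size instead of a hard-coded 64. The sketch below isolates that step with NumPy only. The function name `add_gaussian_noise` and its parameters are hypothetical, and a zero array stands in for real MNIST data.

```python
import numpy as np


def add_gaussian_noise(flat_batch, noise_std=0.2, rng=None):
    """Reshape a (N, 784) batch to (N, 28, 28, 1) and add Gaussian noise.

    Returns the clean and noisy batches, both with shape (N, 28, 28, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    clean = flat_batch.reshape(-1, 28, 28, 1)
    noise = noise_std * rng.standard_normal(clean.shape)
    return clean, clean + noise


# Stand-in for mnist.train.next_batch(128)[0]; real pixel data is not needed here.
flat = np.zeros((128, 784), dtype=np.float32)
clean, noisy = add_gaussian_noise(flat, rng=np.random.default_rng(0))
print(clean.shape, noisy.shape)

# Adding a (128, 784) array to a (128, 28, 28, 1) array cannot be broadcast.
try:
    flat + np.zeros((128, 28, 28, 1))
except ValueError as err:
    print("shape mismatch:", err)
```

Deriving the shape from the input with `reshape(-1, ...)` avoids keeping a separate hard-coded batch size in sync with `batch_size`.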

It does converge:

Iter 0 : 105.12259
Iter 100 : 58.750557
Iter 200 : 46.29199
Iter 300 : 43.19689
Iter 400 : 39.70022
Iter 500 : 38.924805
Iter 600 : 36.81252
Iter 700 : 36.478275
Iter 800 : 37.10568
Iter 900 : 36.200474

You can visualize the decoder's output with matplotlib.pyplot as a sanity check. I did, and it works.
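The answer does not show its plotting code. One way to do such a sanity check is sketched below, assuming batches of shape (N, 28, 28, 1). The helpers `tile_images` and `show_reconstructions` are hypothetical names, and the commented usage assumes the `sess`, `decoder_op`, `noise_imgs`, `batch` and `batch_img_noise` objects from the code above.

```python
import numpy as np


def tile_images(batch, n=8):
    """Place the first n images of a (N, 28, 28, 1) batch side by side."""
    imgs = np.asarray(batch)[:n, :, :, 0]        # (n, 28, 28)
    return np.concatenate(list(imgs), axis=1)    # (28, 28 * n)


def show_reconstructions(clean, noisy, decoded, n=8):
    """Plot clean, noisy and decoded digits in three rows."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    rows = [("clean", clean), ("noisy", noisy), ("decoded", decoded)]
    fig, axes = plt.subplots(len(rows), 1, figsize=(n, 3))
    for ax, (title, batch) in zip(axes, rows):
        ax.imshow(tile_images(batch, n), cmap="gray")
        ax.set_title(title)
        ax.axis("off")
    plt.show()
    return fig


# Usage after training (TensorFlow 1.x session from the listing above):
# decoded = sess.run(decoder_op, feed_dict={noise_imgs: batch_img_noise})
# show_reconstructions(batch, batch_img_noise, decoded)
```

If the decoded row looks like blurred versions of the clean digits, the network is learning something even when the printed loss values are hard to interpret.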

However, you may want to consider changing the loss from tf.sqrt(tf.reduce_sum(tf.square(imgs-decoder_op))) to tf.reduce_sum(tf.square(imgs-decoder_op))
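The answer does not say why it suggests dropping tf.sqrt. One property that may be relevant (an interpretation, not something the answer states) is how the gradient with respect to the residual r = imgs - decoder_op behaves. For sqrt(sum(r²)) it is r / ||r||, which always has norm 1 and is undefined at r = 0. For sum(r²) it is 2r, which shrinks as the reconstruction improves. The NumPy sketch below illustrates this with hypothetical helper names.

```python
import numpy as np


def l2_norm_loss_grad(residual):
    """Gradient of sqrt(sum(r**2)) with respect to r, i.e. r / ||r||."""
    return residual / np.sqrt(np.sum(residual ** 2))


def sum_squares_loss_grad(residual):
    """Gradient of sum(r**2) with respect to r, i.e. 2 * r."""
    return 2.0 * residual


rng = np.random.default_rng(0)
direction = rng.standard_normal(784)   # one flattened 28x28 residual

for scale in (10.0, 1.0, 0.1):
    r = scale * direction
    print(
        "scale", scale,
        "| grad norm with sqrt:", np.linalg.norm(l2_norm_loss_grad(r)),
        "| grad norm without sqrt:", np.linalg.norm(sum_squares_loss_grad(r)),
    )
```

With the square root, the gradient norm stays constant however large or small the error is. Without it, the gradient norm is proportional to the error.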

