
Tensorflow CNN does not learn (image in - image out)

I'm stuck working on a TensorFlow convolutional neural network for a university project and I hope somebody can help me.
It's supposed to output a picture for a picture input. Left is the input, right is the output. Both are in .jpeg format.

[image: input and output]

The weights look like this. The left image shows the weights before learning, the right after a few epochs; they do not change at all with further training.
The net does not seem to learn anything useful and I have a feeling I forgot something basic. The accuracy peaks at around 5% during training.

[image: weights]

Here is what it looks like when I save the input image x.
I don't know if I am making a mistake loading or saving the image.

And this is what the output y of the net looks like:

I based the code on the TensorFlow MNIST tutorial. Here is my code, which I have shortened to make it more readable:

import tensorflow as tf
from PIL import Image
import numpy as np

def weight_variable(dim,stddev=0.35):
    init = tf.random_normal(dim, stddev=stddev)
    return tf.Variable(init)

def bias_variable(dim,val=0.1):
    init = tf.constant(val, shape=dim)
    return tf.Variable(init)

def conv2d(x,W):
    return tf.nn.conv2d(x, W, strides=[1,1,1,1], padding = 'SAME')

def max_pool2x2(x):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding = 'SAME')

def output_pics(pic): # for weights
    # 1-color (single-channel) array cast to uint8 and output as jpeg to file
    pass  # body omitted for readability

def output_pics_color(pic):
    # 3-color (3-channel) array cast to uint8 and output as jpeg to file
    pass  # body omitted for readability

def show_pic(pic):
    # 3-color (3-channel) array cast to uint8 and shown in a window
    pass  # body omitted for readability


filesX = [...] # filenames of inputs for training
filesY = [...] # filenames of outputs for training
test_filesX = [...] # filenames of inputs for testing
test_filesY = [...] # filenames of outputs for testing
px_size = 128 # size of images 128x128 (resized)


filename_queueX = tf.train.string_input_producer(filesX)
filename_queueY = tf.train.string_input_producer(filesY)
filename_testX = tf.train.string_input_producer(test_filesY)
filename_testY = tf.train.string_input_producer(test_filesY)

image_reader = tf.WholeFileReader()
img_name, img_dataX = image_reader.read(filename_queueX)
imageX = tf.image.decode_jpeg(img_dataX)
imageX = tf.image.resize_images(imageX, [px_size,px_size])
imageX.set_shape((px_size,px_size,3))
imageX=tf.cast(imageX, tf.float32)

# ... same for imageY, test_imageX, test_imageY

trainX = []
trainY = []
testX = []
testY = []
j=1


with tf.name_scope('model'):
    x=tf.placeholder(tf.float32, [None, px_size,px_size,3])
    prob = tf.placeholder(tf.float32)

    init_op = tf.global_variables_initializer()

    # load images into lists
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        for i in range(1,65):
            trainX.append(imageX.eval())
            trainY.append(imageY.eval())
        for i in range(1, 10):
            testX.append(test_imageX.eval())
            testY.append(test_imageY.eval())
        coord.request_stop()
        coord.join(threads)    

    # layer 1 
    x_img = tf.reshape(x,[-1,px_size,px_size, 3])    
    W1 = weight_variable([20,20,3,3])
    b1 = bias_variable([3])                       
    y1 = tf.nn.softmax(conv2d(x_img,W1)+b1)

    # layer 2
    W2 = weight_variable([30,30,3,3])
    b2 = bias_variable([3])
    y2=tf.nn.softmax(conv2d(y1, W2)+b2)

    # layer 3
    W3 = weight_variable([40,40,3,3])
    b3 = bias_variable([3])
    y3=tf.nn.softmax(conv2d(y2, W3)+b3)


    y = y3

    with tf.name_scope('train'):
        y_ =tf.placeholder(tf.float32, [None, px_size,px_size,3])
        cross_entropy = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y_, logits=y))
        opt = tf.train.MomentumOptimizer(learning_rate=0.5, momentum=0.1).minimize(cross_entropy)

    with tf.name_scope('eval'):
        correct = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
        accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

        nEpochs = 1000
        batchSize = 10
        res = 0
        with tf.Session() as sess:
            init = tf.global_variables_initializer()
            sess.run(init)
            trAccs = []
            for i in range(nEpochs):
                if i%100 == 0 :
                    train_accuracy = sess.run(accuracy, feed_dict={x:trainX, y_:trainY, prob: 1.0})
                    print(train_accuracy)
                    output_pics(W1)#output weights of layer 1 to file
                    output_pics_color(x)#save input image
                    output_pics_color(y)#save net output
                    sess.run(opt, feed_dict={x:trainX, y_:trainY, prob: 0.5})

• This is an image generation problem.
• The model you selected is a very bad model for image generation tasks.
• Normal CNNs are used for image recognition and object detection tasks.
• The TensorFlow MNIST tutorial is an image classification problem, not an image generation problem.
• It is very important to select an appropriate model type for a particular problem.
• Clearly, with this model there is no chance of achieving the output that you have mentioned.
• I do not even understand how you are calculating the accuracy, because this is an unsupervised learning problem.
• You have used softmax after every layer, which is a really bad idea; the TensorFlow MNIST tutorial does not even have this code.
• Softmax is only used in the last layer.
• In the hidden layers, leaky ReLU or plain ReLU should be used (see the sketch after this list).
• I would suggest you look for a more appropriate deep-learning model.
• Specifically, a combination of a Variational Auto-Encoder and a Generative Adversarial Network (VAE-GAN), or a simple Generative Adversarial Network.
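
To make the activation and loss points concrete, here is a minimal sketch in the same TF 1.x style as the question, reusing its conv2d helper, the weight and bias variables W1..W3 and b1..b3, and the x_img and y_ placeholders. It only illustrates ReLU hidden layers, raw logits at the output, and a per-pixel error measure in place of the argmax accuracy; the Adam learning rate and the scaling of the target images to [0, 1] are assumptions, and this is not a full generative model such as a VAE-GAN.

# hidden layers: ReLU instead of a softmax after every convolution
h1 = tf.nn.relu(conv2d(x_img, W1) + b1)
h2 = tf.nn.relu(conv2d(h1, W2) + b2)

# output layer: keep the raw logits, no activation here
logits = conv2d(h2, W3) + b3

# pixel-wise loss on the logits; assumes y_ holds raw 0-255 pixel values,
# as produced by the question's loading code
y_scaled = y_ / 255.0
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=y_scaled, logits=logits))
train_step = tf.train.AdamOptimizer(1e-4).minimize(loss)

# instead of an argmax "accuracy", track a per-pixel error
prediction = tf.nn.sigmoid(logits)   # predicted image in [0, 1]
pixel_error = tf.reduce_mean(tf.abs(prediction - y_scaled))

Even with these changes a plain three-layer convolution stack remains a weak image-to-image model; the GAN and VAE-GAN families mentioned above are the more appropriate direction.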
