Deep learning implementations in TensorFlow or Keras give drastically different results

Question

Context: I'm using a fully convolutional network to perform image segmentation. Typically, the input is an RGB image with shape = [512, 256] and the target is a 2-channel binary mask defining the annotated regions (the 2nd channel is the opposite of the first channel).
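
The target layout described above can be sketched with NumPy. This is only an illustration under assumptions: the helper name `to_two_channel_mask` and the rectangular annotation are hypothetical, and the real annotations come from the asker's dataset.

```python
import numpy as np

def to_two_channel_mask(binary_mask):
    """Stack a binary mask and its complement along a trailing channel axis."""
    foreground = binary_mask.astype(np.float32)
    background = 1.0 - foreground  # the 2nd channel is the opposite of the first
    return np.stack([foreground, background], axis=-1)

# Hypothetical annotation: 256 rows (height) by 512 columns (width).
annotation = np.zeros((256, 512), dtype=bool)
annotation[64:128, 100:300] = True

target = to_two_channel_mask(annotation)
print(target.shape)
```

Because the two channels always sum to 1 at every pixel, such a target can be read as a one-hot encoding over two classes, which is what a categorical cross-entropy loss expects.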

Question: I have implemented the same CNN using TensorFlow and Keras, but the TensorFlow model doesn't start learning. In fact, the loss even grows with the number of epochs! What is wrong in this TensorFlow implementation that prevents it from learning?

Setup: The dataset is split into 3 subsets: training (78%), testing (8%) and validation (14%) sets, which are fed to the network in batches of 8 images. The graphs show the evolution of the loss for each subset. The images show the prediction after 10 epochs for two different images.
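
A split and batching scheme like the one described in the setup could look roughly as follows. The function names, the random seed and the sample count of 100 are assumptions made for illustration; the question does not show how the data was actually split.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.78, 0.08, 0.14), seed=0):
    """Shuffle sample indices and cut them into train / test / validation parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    # Whatever remains after train and test goes to validation.
    return (order[:n_train],
            order[n_train:n_train + n_test],
            order[n_train + n_test:])

def iterate_batches(indices, batch_size=8):
    """Yield consecutive index batches; the last batch may be smaller."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]

train_idx, test_idx, val_idx = split_indices(100)
train_batches = list(iterate_batches(train_idx))
print(len(train_idx), len(test_idx), len(val_idx), len(train_batches))
```

Each batch of indices would then be used to gather the images and masks passed to the `run(mode, x_batch, y_batch)` functions shown in the question.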

TensorFlow implementation and results

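import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.layers, tf.Session)

shape = [512, 256]  # [width, height]: value taken from the question text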

tf.reset_default_graph()
x = inputs = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 3])
targets = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 2])

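# Encoder: 4 conv + max-pool stages with 16, 32, 64 and 128 filters
for d in range(4):
    # np.exp2 returns a float, so cast to int as the Keras version does
    x = tf.layers.conv2d(x, filters=int(np.exp2(d+4)), kernel_size=[3,3], strides=[1,1], padding="SAME", activation=tf.nn.relu)
    x = tf.layers.max_pooling2d(x, strides=[2,2], pool_size=[2,2], padding="SAME")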

x = tf.layers.conv2d(x, filters=2, kernel_size=[1,1])
logits = tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True)
prediction = tf.nn.softmax(logits)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        return sess.run([loss, optimizer], feed_dict={inputs: x_batch, targets: y_batch})
    else:
        return sess.run([loss, prediction], feed_dict={inputs: x_batch, targets: y_batch})

Keras implementation and results

import numpy as np
import keras as ke

shape = [512, 256]  # [width, height]: value taken from the question text

ke.backend.clear_session()
x = inputs = ke.layers.Input(shape=[shape[1], shape[0], 3])

for d in range(4):
    x = ke.layers.Conv2D(int(np.exp2(d+4)), [3,3], padding="SAME", activation="relu")(x)
    x = ke.layers.MaxPool2D(padding="SAME")(x)

x = ke.layers.Conv2D(2, [1,1], padding="SAME")(x)
logits = ke.layers.Lambda(lambda x: ke.backend.tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True))(x)
prediction = ke.layers.Activation('softmax')(logits)

model = ke.models.Model(inputs=inputs, outputs=prediction)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        loss = model.train_on_batch(x=x_batch, y=y_batch)
        return loss, None
    else:
        loss = model.evaluate(x=x_batch, y=y_batch, batch_size=None, verbose=0)
        prediction = model.predict(x=x_batch, batch_size=None)
        return loss, prediction

There must be a difference between the two, but my reading of the documentation led me nowhere. I would be really interested to know where the difference lies. Thanks in advance!

Answer 1

The answer was in the Keras implementation of softmax, where an unexpected max is subtracted:

from keras import backend as K

def softmax(x, axis=-1):
    # simplified excerpt: Keras uses this branch when x has more than 2 dimensions
    e = K.exp(x - K.max(x, axis=axis, keepdims=True))
    s = K.sum(e, axis=axis, keepdims=True)
    return e / s
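
A small NumPy check may help show what the subtracted max does. In exact arithmetic, softmax is unchanged when the same constant is subtracted from every entry along the axis, so the subtraction is usually explained as protection against overflow in `exp` rather than a change to the function. The sketch below uses hypothetical helper names (`softmax_naive`, `softmax_stable`) and made-up input values; it illustrates the identity only and does not establish why the two models in this thread trained differently.

```python
import numpy as np

def softmax_naive(x, axis=-1):
    """Textbook softmax without any shift."""
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def softmax_stable(x, axis=-1):
    """Softmax with the max along `axis` subtracted first, as in the Keras excerpt."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

# For moderate values both versions agree.
small = np.array([[1.0, 2.0], [0.5, -0.5]])
agree_on_small = np.allclose(softmax_naive(small), softmax_stable(small))

# For large logits exp() overflows to inf in the naive version (inf / inf = nan).
large = np.array([[1000.0, 999.0]])
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = softmax_naive(large)
stable_large = softmax_stable(large)

print(agree_on_small)
print(naive_large)
print(stable_large)
```

With the shift, the largest exponent is always `exp(0) = 1`, so the denominator can never overflow.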

Here is the TensorFlow implementation updated with the max hack, which produced good results:

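import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.layers, tf.Session)

shape = [512, 256]  # [width, height]: value taken from the question text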

tf.reset_default_graph()
x = inputs = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 3])
targets = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 2])

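# Encoder: 4 conv + max-pool stages with 16, 32, 64 and 128 filters
for d in range(4):
    # np.exp2 returns a float, so cast to int as the Keras version does
    x = tf.layers.conv2d(x, filters=int(np.exp2(d+4)), kernel_size=[3,3], strides=[1,1], padding="SAME", activation=tf.nn.relu)
    x = tf.layers.max_pooling2d(x, strides=[2,2], pool_size=[2,2], padding="SAME")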

x = tf.layers.conv2d(x, filters=2, kernel_size=[1,1])
logits = tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True)
# The mysterious hack taken from Keras: subtract the per-pixel max over the channel axis
logits = logits - tf.expand_dims(tf.reduce_max(logits, axis=-1), -1)
prediction = tf.nn.softmax(logits)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        return sess.run([loss, optimizer], feed_dict={inputs: x_batch, targets: y_batch})
    else:
        return sess.run([loss, prediction], feed_dict={inputs: x_batch, targets: y_batch})

Huge thanks to Simon for pointing this out in the Keras implementation :-)

Answer 1 (1, accepted): 2018-03-30 13:05:11
