Deep learning implementation in Tensorflow or Keras gives drastically different results
Context: I'm using a fully convolutional network to perform image segmentation. Typically, the input is an RGB image of shape = [512, 256] and the target is a 2-channel binary mask defining the annotated regions (the 2nd channel is the opposite of the first channel).
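To make the target layout concrete, here is a minimal NumPy sketch of how such a 2-channel mask could be built from a single binary annotation. The helper name `to_two_channel_target` and the rectangular annotation are made up for illustration; the original post does not show how the targets were produced.

```python
import numpy as np

def to_two_channel_target(mask):
    """Stack a binary mask and its complement along a trailing channel axis."""
    mask = np.asarray(mask, dtype=np.float32)
    return np.stack([mask, 1.0 - mask], axis=-1)

# Hypothetical annotation: 256 rows by 512 columns, matching shape = [512, 256]
# when the placeholders are declared as [None, shape[1], shape[0], channels].
mask = np.zeros((256, 512), dtype=np.float32)
mask[64:192, 128:384] = 1.0  # an arbitrary annotated rectangle

target = to_two_channel_target(mask)
print(target.shape)
```

Each pixel of `target` is a one-hot vector over the two classes, which is the form that a categorical cross-entropy loss expects.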
Question: I have the same CNN implemented in both Tensorflow and Keras, but the Tensorflow model doesn't start learning. Actually, the loss even grows with the number of epochs! What is wrong in this Tensorflow implementation that prevents it from learning?
Setup: The dataset is split into 3 subsets: training (78%), testing (8%) and validation (14%) sets, which are fed to the network in batches of 8 images. The graphs show the evolution of the loss for each subset. The images show the prediction after 10 epochs for two different images.
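For readers who want to reproduce the setup, the split and batching could look roughly like the sketch below. This is an assumption about how the data might be partitioned, not the code used in the post: the helpers `split_indices` and `iter_batches`, the random seed and the sample count of 100 are all hypothetical.

```python
import numpy as np

def split_indices(n_samples, fractions=(0.78, 0.08, 0.14), seed=0):
    """Shuffle sample indices and cut them into train/test/validation parts."""
    rng = np.random.RandomState(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(fractions[0] * n_samples))
    n_test = int(round(fractions[1] * n_samples))
    # Whatever remains after train and test goes to validation.
    return (order[:n_train],
            order[n_train:n_train + n_test],
            order[n_train + n_test:])

def iter_batches(indices, batch_size=8):
    """Yield consecutive index batches; the last one may be smaller."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]

train_idx, test_idx, val_idx = split_indices(100)
train_batches = list(iter_batches(train_idx))
print(len(train_idx), len(test_idx), len(val_idx), len(train_batches))
```

Each batch of indices would then be used to select the images and masks passed as `x_batch` and `y_batch` to the `run` function.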
Tensorflow implementation and results
import numpy as np
import tensorflow as tf

# `shape` is the image shape from the context above, e.g. shape = [512, 256]
tf.reset_default_graph()
x = inputs = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 3])
targets = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 2])

# Encoder: 4 conv + max-pool stages with 16, 32, 64 and 128 filters
for d in range(4):
    x = tf.layers.conv2d(x, filters=int(np.exp2(d+4)), kernel_size=[3,3], strides=[1,1], padding="SAME", activation=tf.nn.relu)
    x = tf.layers.max_pooling2d(x, strides=[2,2], pool_size=[2,2], padding="SAME")

# 1x1 convolution down to 2 classes, then resize back to the input resolution
x = tf.layers.conv2d(x, filters=2, kernel_size=[1,1])
logits = tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True)
prediction = tf.nn.softmax(logits)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        return sess.run([loss, optimizer], feed_dict={inputs: x_batch, targets: y_batch})
    else:
        return sess.run([loss, prediction], feed_dict={inputs: x_batch, targets: y_batch})
Keras implementation and results
import numpy as np
import keras as ke

ke.backend.clear_session()
x = inputs = ke.layers.Input(shape=[shape[1], shape[0], 3])

# Encoder: 4 conv + max-pool stages with 16, 32, 64 and 128 filters
for d in range(4):
    x = ke.layers.Conv2D(int(np.exp2(d+4)), [3,3], padding="SAME", activation="relu")(x)
    x = ke.layers.MaxPool2D(padding="SAME")(x)

# 1x1 convolution down to 2 classes, then resize back to the input resolution
x = ke.layers.Conv2D(2, [1,1], padding="SAME")(x)
logits = ke.layers.Lambda(lambda x: ke.backend.tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True))(x)
prediction = ke.layers.Activation('softmax')(logits)

model = ke.models.Model(inputs=inputs, outputs=prediction)
model.compile(optimizer="rmsprop", loss="categorical_crossentropy")

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        loss = model.train_on_batch(x=x_batch, y=y_batch)
        return loss, None
    else:
        loss = model.evaluate(x=x_batch, y=y_batch, batch_size=None, verbose=0)
        prediction = model.predict(x=x_batch, batch_size=None)
        return loss, prediction
There must be a difference between the two, but my understanding of the documentation led me nowhere. I would be really interested to know where the difference lies. Thanks in advance!
The answer was in the Keras implementation of softmax, where an unexpected max is subtracted:
def softmax(x, axis=-1):
    # when x is a 2 dimensional tensor
    e = K.exp(x - K.max(x, axis=axis, keepdims=True))
    s = K.sum(e, axis=axis, keepdims=True)
    return e / s
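As background on the max subtraction: in exact arithmetic, shifting all logits by the same constant leaves the softmax unchanged, and the shift is commonly used to keep exp from overflowing. The NumPy sketch below illustrates that property only. It is not the Keras or Tensorflow code, the names `naive_softmax` and `stable_softmax` are made up for the example, and it does not by itself explain the training behaviour reported here.

```python
import numpy as np

def naive_softmax(x, axis=-1):
    """Softmax computed directly; exp() overflows for large inputs."""
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def stable_softmax(x, axis=-1):
    """Softmax with the per-row max subtracted before exp()."""
    e = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e / np.sum(e, axis=axis, keepdims=True)

small = np.array([[1.0, 2.0], [0.5, -0.5]])
large = np.array([[1000.0, 1001.0]])

# For moderate logits both versions agree.
print(np.allclose(naive_softmax(small), stable_softmax(small)))

# For large logits the naive version produces inf / inf = nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive_large = naive_softmax(large)
stable_large = stable_softmax(large)
print(naive_large, stable_large)
```

If the logits in a network grow large, the unshifted form can therefore produce invalid probabilities while the shifted form stays finite.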
Here is the Tensorflow implementation updated with the max hack, together with the associated good results:
import numpy as np
import tensorflow as tf

# `shape` is the image shape from the context above, e.g. shape = [512, 256]
tf.reset_default_graph()
x = inputs = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 3])
targets = tf.placeholder(tf.float32, shape=[None, shape[1], shape[0], 2])

# Encoder: 4 conv + max-pool stages with 16, 32, 64 and 128 filters
for d in range(4):
    x = tf.layers.conv2d(x, filters=int(np.exp2(d+4)), kernel_size=[3,3], strides=[1,1], padding="SAME", activation=tf.nn.relu)
    x = tf.layers.max_pooling2d(x, strides=[2,2], pool_size=[2,2], padding="SAME")

# 1x1 convolution down to 2 classes, then resize back to the input resolution
x = tf.layers.conv2d(x, filters=2, kernel_size=[1,1])
logits = tf.image.resize_images(x, [shape[1], shape[0]], align_corners=True)

# The mysterious hack taken from Keras: subtract the per-pixel max over the class axis
logits = logits - tf.expand_dims(tf.reduce_max(logits, axis=-1), -1)
prediction = tf.nn.softmax(logits)

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=targets, logits=logits))
optimizer = tf.train.RMSPropOptimizer(learning_rate=0.001).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

def run(mode, x_batch, y_batch):
    if mode == 'TRAIN':
        return sess.run([loss, optimizer], feed_dict={inputs: x_batch, targets: y_batch})
    else:
        return sess.run([loss, prediction], feed_dict={inputs: x_batch, targets: y_batch})
Huge thanks to Simon for pointing this out in the Keras implementation :-)