What is the difference between Model.train_on_batch from keras and Session.run([train_optimizer]) from tensorflow?
In the following Keras and TensorFlow implementations of training a neural network, how does model.train_on_batch([x], [y])
in the Keras implementation differ from sess.run([train_optimizer, cross_entropy, accuracy_op], feed_dict=feed_dict)
in the TensorFlow implementation? In particular: how can those two lines lead to different computations during training?
keras_version.py
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes, activation="softmax")(input_x)

model = Model([input_x], [c])
opt = Adam(lr)
# metrics=['accuracy'] is needed so train_on_batch returns (loss, acc)
model.compile(loss=['categorical_crossentropy'], optimizer=opt, metrics=['accuracy'])

nb_batchs = int(len(x_train) / batch_size)
for epoch in range(epochs):
    loss = 0.0
    for batch in range(nb_batchs):
        x = x_train[batch*batch_size:(batch+1)*batch_size]
        y = y_train[batch*batch_size:(batch+1)*batch_size]
        loss_batch, acc_batch = model.train_on_batch([x], [y])
        loss += loss_batch
    print(epoch, loss / nb_batchs)
tensorflow_version.py
import tensorflow as tf
from keras.layers import Input, Dense

input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes)(input_x)  # raw logits; no softmax activation here

input_y = tf.placeholder(tf.float32, shape=[None, num_classes], name="label")
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=input_y, logits=c, name="xentropy"),
    name="xentropy_mean"
)
train_optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(cross_entropy)

nb_batchs = int(len(x_train) / batch_size)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        loss = 0.0
        acc = 0.0
        for batch in range(nb_batchs):
            x = x_train[batch*batch_size:(batch+1)*batch_size]
            y = y_train[batch*batch_size:(batch+1)*batch_size]
            feed_dict = {input_x: x, input_y: y}
            _, loss_batch = sess.run([train_optimizer, cross_entropy], feed_dict=feed_dict)
            loss += loss_batch
        print(epoch, loss / nb_batchs)
Note: This question follows "Same (?) model converges in Keras but not in Tensorflow", which was considered too broad, but in which I show exactly why I think those two statements are somehow different and lead to different computations.
Yes, the results can be different. The results shouldn't be surprising if you know the following things in advance:

The cross-entropy implementations in TensorFlow and Keras are different. TensorFlow assumes the input to tf.nn.softmax_cross_entropy_with_logits_v2 to be the raw unnormalized logits, while Keras accepts the inputs as probabilities (the output of the softmax layer).
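The consequence can be sketched with plain NumPy (a minimal illustration with made-up values, not the actual framework code): the two definitions agree mathematically, but TensorFlow computes the loss directly from the logits via log-sum-exp, while Keras takes the log of already-softmaxed, clipped probabilities, so the two paths differ numerically and produce different backward graphs.

```python
import numpy as np

# Hypothetical single example with 3 classes (values chosen for illustration).
logits = np.array([2.0, 1.0, 0.1])
y = np.array([1.0, 0.0, 0.0])  # one-hot label

# TensorFlow-style: loss computed directly from raw logits,
# using the log-sum-exp trick for numerical stability.
log_sum_exp = np.log(np.sum(np.exp(logits - logits.max()))) + logits.max()
ce_from_logits = -np.sum(y * (logits - log_sum_exp))

# Keras-style: the model's softmax layer produces probabilities first;
# the loss then clips them away from 0 and 1 and takes the log.
probs = np.exp(logits) / np.sum(np.exp(logits))
eps = 1e-7  # small clipping constant, as Keras-style losses use
probs_clipped = np.clip(probs, eps, 1.0 - eps)
ce_from_probs = -np.sum(y * np.log(probs_clipped))

print(ce_from_logits, ce_from_probs)  # agree to within clipping error here
```

On well-behaved inputs like this one the two values coincide, but with very large or very small logits the probability path saturates (log of a clipped value), whereas the logits path stays exact, and gradients are computed through different expressions.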
The optimizer implementations in Keras and TensorFlow are different.
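For example, the default epsilon values differ between the two Adam implementations of that era (roughly 1e-7 in Keras via K.epsilon() versus 1e-8 in tf.train.AdamOptimizer). A textbook Adam step in NumPy shows the effect; this is a sketch, not either framework's actual code (TensorFlow, for instance, folds the bias correction into the learning rate and adds epsilon to sqrt(v) rather than sqrt(v_hat)):

```python
import numpy as np

def adam_step(grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One bias-corrected textbook Adam update for a single parameter."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    return lr * m_hat / (np.sqrt(v_hat) + eps), m, v

grad = 1e-6  # with tiny gradients, sqrt(v_hat) is comparable to eps
update_small_eps, _, _ = adam_step(grad, 0.0, 0.0, t=1, eps=1e-8)  # TF-like default
update_large_eps, _, _ = adam_step(grad, 0.0, 0.0, t=1, eps=1e-7)  # Keras-like default

print(update_small_eps, update_large_eps)  # the two steps differ by several percent
```

With small gradients, sqrt(v_hat) shrinks toward epsilon's magnitude, so even this default alone changes the effective step size, and the difference compounds over many updates.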