
What is the difference between Model.train_on_batch from Keras and Session.run([train_optimizer]) from TensorFlow?

In the following Keras and TensorFlow implementations of training a neural network, how does model.train_on_batch([x], [y]) in the Keras implementation differ from sess.run([train_optimizer, cross_entropy, accuracy_op], feed_dict=feed_dict) in the TensorFlow implementation? In particular, how can those two lines lead to different computations during training?

keras_version.py

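from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam

# Single dense layer with a softmax output, so the model emits probabilities.
input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes, activation="softmax")(input_x)

model = Model([input_x], [c])
opt = Adam(lr)
# metrics=['accuracy'] makes train_on_batch return [loss, accuracy];
# without it only a scalar loss is returned and the unpacking below fails.
model.compile(loss=['categorical_crossentropy'], optimizer=opt, metrics=['accuracy'])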

nb_batchs = int(len(x_train)/batch_size)

for epoch in range(epochs):
    loss = 0.0
    for batch in range(nb_batchs):
        x = x_train[batch*batch_size:(batch+1)*batch_size]
        y = y_train[batch*batch_size:(batch+1)*batch_size]

        loss_batch, acc_batch = model.train_on_batch([x], [y])

        loss += loss_batch
    print(epoch, loss / nb_batchs)

tensorflow_version.py

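import tensorflow as tf
from keras.layers import Input, Dense

# No activation here: the layer outputs raw logits for the fused softmax loss below.
input_x = Input(shape=input_shape, name="x")
c = Dense(num_classes)(input_x)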

input_y = tf.placeholder(tf.float32, shape=[None, num_classes], name="label")
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=input_y, logits=c, name="xentropy"),
    name="xentropy_mean"
)
train_optimizer = tf.train.AdamOptimizer(learning_rate=lr).minimize(cross_entropy)

nb_batchs = int(len(x_train)/batch_size)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(epochs):
        loss = 0.0
        acc = 0.0

        for batch in range(nb_batchs):
            x = x_train[batch*batch_size:(batch+1)*batch_size]
            y = y_train[batch*batch_size:(batch+1)*batch_size]

            feed_dict = {input_x: x,
                         input_y: y}
            _, loss_batch = sess.run([train_optimizer, cross_entropy], feed_dict=feed_dict)

            loss += loss_batch
        print(epoch, loss / nb_batchs)

Note: This question follows "Same (?) model converges in Keras but not in Tensorflow", which was considered too broad, but in which I show exactly why I think those two statements are somehow different and lead to different computations.

Yes, the results can be different. This shouldn't be surprising once you know the following:

  1. The implementations of cross-entropy in TensorFlow and Keras are different. TensorFlow's tf.nn.softmax_cross_entropy_with_logits_v2 expects raw, unnormalized logits as input, while Keras' categorical_crossentropy by default expects probabilities. This is why the Keras model applies a softmax activation and the TensorFlow model does not.
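  2. The implementations of the optimizers in Keras and TensorFlow are different. For example, the default epsilon of Adam is not the same: Keras falls back to K.epsilon() (1e-7), while tf.train.AdamOptimizer defaults to 1e-8.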
  3. You may be shuffling the data, so the batches are not passed in the same order. This matters little if you train the model for long, but the first few epochs can be entirely different. Make sure the same batches are passed to both implementations, then compare the results.
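
To illustrate the first point, here is a minimal pure-Python sketch (not the actual TensorFlow or Keras code) of the two ways the loss can be computed. The function names are hypothetical, and the clipping constant eps=1e-7 is an assumption meant to mirror Keras' default K.epsilon(). The loss from logits uses a fused log-softmax, while the loss from probabilities clips the probabilities before taking the log.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def xent_from_logits(label, logits):
    # Fused log-softmax, in the spirit of softmax_cross_entropy_with_logits_v2.
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_norm) for t, z in zip(label, logits))

def xent_from_probs(label, probs, eps=1e-7):
    # Probability-based loss with clipping, in the spirit of Keras'
    # categorical_crossentropy; eps is an assumed clipping constant.
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    return -sum(t * math.log(p) for t, p in zip(label, clipped))

label = [1.0, 0.0, 0.0]        # one-hot target, true class is 0
moderate = [2.0, 1.0, 0.1]     # ordinary logits
extreme = [-50.0, 50.0, 0.0]   # confidently wrong prediction

for name, logits in [("moderate", moderate), ("extreme", extreme)]:
    from_logits = xent_from_logits(label, logits)
    from_probs = xent_from_probs(label, softmax(logits))
    print(name, from_logits, from_probs)
```

For ordinary logits the two values agree up to floating-point error, so the two losses in the question are mathematically the same. For the saturated example, the logits-based loss is about 100 while the clipped probability-based loss is capped near -log(1e-7), about 16.1. In that regime the two formulations would also produce different gradients, which is one way the training runs could diverge.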


