Logging training and validation loss in tensorboard

I'm trying to learn how to use TensorFlow and TensorBoard. I have a test project based on the MNIST neural net tutorial.

In my code, I construct a node that calculates the fraction of digits in a data set that are correctly classified, like this:

correct = tf.nn.in_top_k(self._logits, labels, 1)
correct = tf.to_float(correct)
accuracy = tf.reduce_mean(correct)

Here, self._logits is the inference part of the graph, and labels is a placeholder that contains the correct labels.
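To make concrete what those three ops compute, here is a minimal plain-Python sketch (no TensorFlow involved). The function name top1_accuracy and the example scores are hypothetical, and ties are counted as correct in this sketch.

```python
def top1_accuracy(logits, labels):
    """Fraction of rows whose labelled class has the highest score."""
    # A row counts as correct when no other class scores strictly higher
    # than the labelled class (so ties count as correct in this sketch).
    correct = [1.0 if row[label] >= max(row) else 0.0
               for row, label in zip(logits, labels)]
    # Averaging the 0/1 values gives the fraction classified correctly.
    return sum(correct) / len(correct)


# Hypothetical scores for four examples over three classes.
example_logits = [
    [2.0, 0.5, 0.1],  # highest score: class 0
    [0.2, 1.5, 0.3],  # highest score: class 1
    [0.1, 0.2, 3.0],  # highest score: class 2
    [1.0, 0.9, 0.8],  # highest score: class 0
]
example_labels = [0, 1, 2, 1]

print(top1_accuracy(example_logits, example_labels))  # 0.75
```

Three of the four rows have their labelled class on top, so the mean of the 0/1 values is 0.75.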

Now, what I would like to do is evaluate the accuracy for both the training set and the validation set as training proceeds. I can do this by running the accuracy node twice, with different feed_dicts:

train_acc = sess.run(accuracy, feed_dict={images : training_set.images, labels : training_set.labels})
valid_acc = sess.run(accuracy, feed_dict={images : validation_set.images, labels : validation_set.labels})

This works as intended. I can print the values, and I can see that initially, the two accuracies will both increase, and eventually the validation accuracy will flatten out while the training accuracy keeps increasing.

However, I would also like to get graphs of these values in TensorBoard, and I cannot figure out how to do this. If I simply add a scalar_summary to accuracy, the logged values will not distinguish between training set and validation set.

I also tried creating two identical accuracy nodes with different names and running one on the training set and one on the validation set. I then added a scalar_summary to each of these nodes. This does give me two graphs in TensorBoard, but instead of one graph showing the training set accuracy and one showing the validation set accuracy, they both show identical values that do not match either of the ones printed to the terminal.

I am probably misunderstanding how to solve this problem. What is the recommended way of separately logging the output from a single node for different inputs?

There are several different ways you could achieve this, but you're on the right track with creating different tf.summary.scalar() nodes. Since you must explicitly call FileWriter.add_summary() each time you want to log a quantity to the event file, the simplest approach is probably to fetch the appropriate summary node each time you want to get the training or validation accuracy:

accuracy = tf.reduce_mean(correct)

# Two summary ops over the same accuracy node, each with its own tag.
training_summary = tf.summary.scalar("training_accuracy", accuracy)
validation_summary = tf.summary.scalar("validation_accuracy", accuracy)

writer = tf.summary.FileWriter(...)

for step in range(NUM_STEPS):

  # Perform a training step....

  if step % LOG_PERIOD == 0:

    # To log training accuracy.
    train_acc, train_summ = sess.run(
        [accuracy, training_summary], 
        feed_dict={images : training_set.images, labels : training_set.labels})
    writer.add_summary(train_summ, step) 

    # To log validation accuracy.
    valid_acc, valid_summ = sess.run(
        [accuracy, validation_summary],
        feed_dict={images : validation_set.images, labels : validation_set.labels})
    writer.add_summary(valid_summ, step)

Alternatively, you could create a single summary op whose tag is a tf.placeholder(tf.string, []) and feed the string "training_accuracy" or "validation_accuracy" as appropriate. Note that this depends on the summary op accepting a string tensor as its tag, which the older tf.scalar_summary(tags, values) did; tf.summary.scalar() takes a plain Python string as its name.

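Another way to do it is to use a second file writer, which lets you keep using the merge_summaries command. Each writer logs to its own directory, so the same summary op can be run on the training set and on the validation set and the two results stay separate.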

train_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/train',
                                      sess.graph)
test_writer = tf.summary.FileWriter(FLAGS.summaries_dir + '/test')
tf.global_variables_initializer().run()

Here is the complete documentation, which works fine for me: TensorBoard: Visualizing Learning
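The snippet in this answer only creates the two writers. A fuller sketch of the two-writer approach might look like the following. It assumes TF1-style graph mode through tensorflow.compat.v1. The tiny linear model, the hand-written feeds, and the directory names are hypothetical stand-ins, not the code from the question.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_eager_execution()  # TF1-style graph and session, as in the thread

# Toy stand-ins (assumptions): 4 input features, 3 classes, a linear model.
images = tf.placeholder(tf.float32, [None, 4])
labels = tf.placeholder(tf.int32, [None])
weights = tf.Variable(tf.zeros([4, 3]))
logits = tf.matmul(images, weights)

correct = tf.nn.in_top_k(logits, labels, 1)
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

# One summary op with one tag; the writer decides which run it belongs to.
tf.summary.scalar("accuracy", accuracy)
merged = tf.summary.merge_all()

# Hypothetical data: two tiny hand-written "datasets".
train_feed = {images: [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]],
              labels: [0, 1]}
valid_feed = {images: [[0.0, 0.0, 1.0, 0.0]], labels: [2]}

log_dir = tempfile.mkdtemp()
train_dir = os.path.join(log_dir, "train")
valid_dir = os.path.join(log_dir, "validation")

with tf.Session() as sess:
    train_writer = tf.summary.FileWriter(train_dir, sess.graph)
    valid_writer = tf.summary.FileWriter(valid_dir)
    sess.run(tf.global_variables_initializer())

    for step in range(3):
        # A real training step would go here.
        # Same summary op, different feed, different writer.
        train_writer.add_summary(sess.run(merged, feed_dict=train_feed), step)
        valid_writer.add_summary(sess.run(merged, feed_dict=valid_feed), step)

    train_writer.close()
    valid_writer.close()
```

Pointing TensorBoard at log_dir should then list train and validation as two runs, drawn as two curves on the same accuracy chart.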
