
TensorFlow: add summary during graph execution

I want to output the accuracy every fixed number of steps in the CIFAR-10 example from the TensorFlow tutorial. I tried calling tf.summary.scalar(..) inside the hook, which produces the error: Graph is finalized. However, I think I can only access the step count inside the hook (I am evaluating the accuracy with cifar10_eval.py, also example code from the TensorFlow tutorial). I also tried to write the global_step into the checkpoint, but unfortunately MonitoredTrainingSession only supports a time interval (save_checkpoint_secs), not a step interval. Any suggestions?

cifar10_train.py

import time
import tensorflow as tf

import cifar10

def train():
  """Train CIFAR-10 for a number of steps."""
  with tf.Graph().as_default():
    global_step = tf.contrib.framework.get_or_create_global_step()

    # Get images and labels for CIFAR-10, build the inference graph,
    # and compute the loss consumed by the train op below.
    images, labels = cifar10.distorted_inputs()
    logits = cifar10.inference(images)
    loss = cifar10.loss(logits, labels)

    # Build a Graph that trains the model with one batch of examples and
    # updates the model parameters.
    train_op = cifar10.train(loss, global_step)

    class _LoggerHook(tf.train.SessionRunHook):
      """Logs loss and runtime."""

      def begin(self):
        self._step = -1
        self._start_time = time.time()

      def before_run(self, run_context):
        self._step += 1
        return tf.train.SessionRunArgs(loss)  # Asks for loss value.

      def after_run(self, run_context, run_values):
        pass  # <output some information>, e.g. log the loss value

    with tf.train.MonitoredTrainingSession(
        checkpoint_dir=FLAGS.train_dir,
        hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
               tf.train.NanTensorHook(loss),
               _LoggerHook()],
        config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement)) as mon_sess:
      while not mon_sess.should_stop():
        mon_sess.run(train_op)

First of all, note that the CIFAR-10 tutorial provided by TensorFlow runs training and evaluation in two separate sessions. When the training session saves a checkpoint, the evaluation session retrieves that checkpoint, loads the parameters, and performs the evaluation. The code you pasted here is only the training session.

My advice is to first make clear which summary you are going to write. Because training and evaluation are two different sessions, there are two summary writers, and they are usually given different output paths.

Based on your needs, here are some hints for your project.

  • You should not write anything extra to the checkpoint, which holds the bulk of the model parameters.
  • Use a summary writer or standard I/O to record the accuracy.
  • You got the error when you tried to use the summary writer because all summary elements, including scalars, must be added to the graph before the session is launched.
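The second bullet deserves a concrete illustration: a summary writer can be fed a hand-built `tf.Summary` protocol buffer, which creates no graph ops and therefore works even after the graph is finalized; this is in fact the pattern `cifar10_eval.py` itself uses. Below is a minimal sketch (not the tutorial's exact code), using `tf.compat.v1` names so it also runs under TF 2.x, with a made-up accuracy value and output directory:

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()  # FileWriter requires graph mode

# A hand-built Summary proto adds no ops to the (possibly finalized) graph,
# so it can safely be written from inside a hook.
writer = tf1.summary.FileWriter("/tmp/eval_summaries")  # hypothetical path
accuracy = 0.87  # hypothetical value computed from the eval predictions
summary = tf1.Summary(value=[
    tf1.Summary.Value(tag="accuracy", simple_value=accuracy)])
writer.add_summary(summary, global_step=100)
writer.flush()
```

Because nothing here touches the graph, the same pattern can run inside a hook's `after_run`, where `run_values.results` would supply the real accuracy.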

I suspect TensorFlow treats summaries as part of the default graph, so you should configure your summary writer before you run a session.
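Following that advice, one way to get summaries at a fixed step interval is to create the scalar summary op before the session launches and hand it to a `SummarySaverHook`, whose `save_steps` argument is exactly the step-based interval the question asks for. Below is a minimal, self-contained sketch using `tf.compat.v1` names: the model is replaced by a toy graph, and `accuracy` is a made-up stand-in for the real accuracy tensor.

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

with tf1.Graph().as_default():
    # Toy stand-ins for the real CIFAR-10 model and train op.
    global_step = tf1.train.get_or_create_global_step()
    train_op = tf1.assign_add(global_step, 1)
    accuracy = tf1.cast(global_step, tf.float32) / 100.0  # made-up "accuracy"

    # The scalar summary op must exist before MonitoredTrainingSession
    # finalizes the graph -- creating it inside a hook's after_run is too late.
    acc_summary = tf1.summary.scalar("accuracy", accuracy)

    summary_hook = tf1.train.SummarySaverHook(
        save_steps=10,  # a step interval, unlike save_checkpoint_secs
        output_dir="/tmp/acc_summaries",  # hypothetical path
        summary_op=acc_summary)

    with tf1.train.MonitoredTrainingSession(hooks=[summary_hook]) as sess:
        for _ in range(30):
            sess.run(train_op)
```

Note also that later TF 1.x releases added a `save_checkpoint_steps` argument to `MonitoredTrainingSession` as a step-based alternative to `save_checkpoint_secs`.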

I got this error when I worked with CycleGAN before. I solved the issue with the two lines below; add them before building the graph.

import tensorflow as tf

tf.reset_default_graph()

# as_default() returns a context manager and only takes effect inside `with`;
# build the model and its summaries inside this block.
with tf.Graph().as_default():
    ...

I hope this will help you.

Disclaimer: the technical posts on this site follow the CC BY-SA 4.0 license; if you repost, please credit this site or the original source. For any questions, contact: yoyou2525@163.com.

 