
Using tensorboard with a DQN algorithm

For reinforcement learning I have read that TensorBoard isn't ideal, since it logs input per episode and/or per step. Because reinforcement learning involves thousands of steps, this doesn't give a useful overview of training. I saw this modified TensorBoard class here: https://pythonprogramming.net/deep-q-learning-dqn-reinforcement-learning-python-tutorial/

The class:

import os
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard

class ModifiedTensorBoard(TensorBoard):
    # Overriding init to set initial step and writer (we want one log file for all .fit() calls)
    def __init__(self, name, **kwargs):
        super().__init__(**kwargs)
        self.step = 1
        self.writer = tf.summary.create_file_writer(self.log_dir)
        self._log_write_dir = os.path.join(self.log_dir, name)

    # Overriding this method to stop creating default log writer
    def set_model(self, model):
        pass

    # Overrided, saves logs with our step number
    # (otherwise every .fit() will start writing from 0th step)
    def on_epoch_end(self, epoch, logs=None):
        self.update_stats(**logs)

    # Overrided
    # We train for one batch only, no need to save anything at epoch end
    def on_batch_end(self, batch, logs=None):
        pass

    # Overrided, so won't close writer
    def on_train_end(self, _):
        pass

    def on_train_batch_end(self, batch, logs=None):
        pass

    # Custom method for saving own metrics
    # Creates writer, writes custom metrics and closes writer
    def update_stats(self, **stats):
        self._write_logs(stats, self.step)

    def _write_logs(self, logs, index):
        with self.writer.as_default():
            for name, value in logs.items():
                tf.summary.scalar(name, value, step=index)
            # Advance the shared step once per update and flush after all metrics are written
            self.step += 1
            self.writer.flush()
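
For context, the tutorial this class comes from drives it roughly as sketched below; the episode loop, MODEL_NAME and the logged values are placeholders rather than code from the question:

import time

MODEL_NAME = "dqn"  # placeholder used for the log directory name
tensorboard = ModifiedTensorBoard(MODEL_NAME,
                                  log_dir=f"logs/{MODEL_NAME}-{int(time.time())}")

for episode in range(1, 11):      # the real loop would run over all episodes
    tensorboard.step = episode    # keep one shared step across every .fit() call
    episode_reward = 0.0          # accumulate the reward while stepping the environment
    # ... run the episode and train the network here ...
    tensorboard.update_stats(reward=episode_reward, epsilon=0.1)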

and I would like to make it work with this model:

n_actions = env.action_space.n
input_dim = env.observation_space.n
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(20, input_dim=input_dim, activation='relu'))
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.add(tf.keras.layers.Dense(n_actions, activation='linear'))
model.compile(optimizer=tf.keras.optimizers.Adam(), loss='mse')
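
For reference, one way the two pieces are typically wired together, assuming the tensorboard instance from the sketch above exists: train on a minibatch with a silent .fit() and push the metrics you care about through update_stats(), which writes them against the shared step counter. (The tutorial itself passes the instance via callbacks=[tensorboard] instead; whether that works unchanged depends on your TensorFlow version, since the inherited TensorBoard internals differ between releases.) The random arrays below are only stand-ins for a minibatch sampled from replay memory:

import numpy as np

states = np.random.rand(32, input_dim).astype(np.float32)    # stand-in minibatch of states
targets = np.random.rand(32, n_actions).astype(np.float32)   # stand-in Q-value targets

history = model.fit(states, targets, batch_size=32, verbose=0, shuffle=False)

# Log the training loss (plus anything else) under the current step counter
tensorboard.update_stats(loss=history.history["loss"][0])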

But I have yet to get it to work. For anyone who has worked with TensorBoard before: do you know how to set this up? Any insight is greatly appreciated.

I always use TensorBoard during training of RL algorithms, without any modified code like the above. Simply create your writer:

writer = tf.summary.create_file_writer(logdir=log_folder)

Start your code with:

with writer.as_default():
    ... do everything indented inside here

And, for example, if you want to save your reward or the weights of your first layer to TensorBoard every 100 steps, just do:

if step % 100 == 0:
    tf.summary.scalar(name="reward", data=reward, step=step)
    dqn_variable = model.trainable_variables
    tf.summary.histogram(name="dqn_variables", data=tf.convert_to_tensor(dqn_variable[0]), step=step)
    writer.flush()

That should do the trick :)
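
Putting the answer together, a minimal end-to-end sketch: it reuses the model defined in the question, the environment interaction is replaced by a random stand-in reward, and log_folder is a placeholder path:

import numpy as np
import tensorflow as tf

log_folder = "logs/dqn_run"   # placeholder log directory
writer = tf.summary.create_file_writer(logdir=log_folder)

with writer.as_default():
    for step in range(1, 1001):
        reward = np.random.rand()      # stand-in for the reward returned by the environment
        # ... choose an action, step the environment, train the model here ...
        if step % 100 == 0:
            tf.summary.scalar(name="reward", data=reward, step=step)
            dqn_variables = model.trainable_variables
            tf.summary.histogram(name="dqn_variables",
                                 data=tf.convert_to_tensor(dqn_variables[0]),
                                 step=step)
            writer.flush()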
