Add a TensorBoard metric from my PettingZoo environment

I'm using TensorBoard to monitor training progress in the PettingZoo environment that my agents are playing. I can see the reward go up over time, which is good, but I'd like to add other metrics that are specific to my environment, i.e. I'd like TensorBoard to show more charts with my metrics and how they improve over time.

The only way I could figure out to do that was by inserting a few lines into the learn method of OnPolicyAlgorithm, which is part of SB3. This works, and I got the charts I wanted:

(The two bottom charts are the ones I added.)

But obviously editing library code isn't good practice. I should make the modifications in my own code, not in the libraries. Is there currently a more elegant way to add a metric from my PettingZoo environment to TensorBoard?

You can use a callback to add your own logs. See the example below. In this case the callback is called at every step. There are other callbacks you can use, depending on your use case.

import numpy as np

from stable_baselines3 import SAC
from stable_baselines3.common.callbacks import BaseCallback

model = SAC("MlpPolicy", "Pendulum-v1", tensorboard_log="/tmp/sac/", verbose=1)


class TensorboardCallback(BaseCallback):
    """
    Custom callback for plotting additional values in tensorboard.
    """

    def __init__(self, verbose=0):
        super().__init__(verbose)

    def _on_step(self) -> bool:
        # Log scalar value (here a random variable)
        value = np.random.random()
        self.logger.record('random_value', value)
        return True


model.learn(50000, callback=TensorboardCallback())
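
To log values that come from the environment rather than a random number, one option is to read them from the info dicts returned by the environment's step. The following is only a sketch under some assumptions: that your environment puts its metrics into those info dicts, that the algorithm exposes them to the callback as self.locals["infos"] during rollout collection, and that the key names used here (food_collected, collisions) are hypothetical stand-ins for your own metrics. The helper mean_info_metrics and the class EnvMetricsCallback are illustrative names, not part of SB3.

```python
from statistics import fmean

try:
    from stable_baselines3.common.callbacks import BaseCallback
except ImportError:
    # Fallback so the helper below can be tried without SB3 installed;
    # the callback itself needs the real BaseCallback to work.
    BaseCallback = object


def mean_info_metrics(infos, keys):
    """Average each requested key over the info dicts that contain it."""
    means = {}
    for key in keys:
        values = [float(info[key]) for info in infos if key in info]
        if values:
            means[key] = fmean(values)
    return means


class EnvMetricsCallback(BaseCallback):
    """Forward selected entries of the env's info dicts to the SB3 logger."""

    def __init__(self, keys, verbose=0):
        super().__init__(verbose)
        self.keys = tuple(keys)

    def _on_step(self) -> bool:
        # Assumption: the latest per-env info dicts are available
        # under self.locals["infos"] while rollouts are collected.
        infos = self.locals.get("infos", [])
        for key, value in mean_info_metrics(infos, self.keys).items():
            # record_mean averages the values seen between two log dumps.
            self.logger.record_mean(f"env/{key}", value)
        return True
```

With a model created as in the answer above, this could be passed as model.learn(50000, callback=EnvMetricsCallback(["food_collected", "collisions"])). Keys that are missing from a given step's info dicts are skipped, so metrics that are only reported occasionally (for example at episode end) should still be logged when they appear.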
