
Why use Env class from OpenAI Gym as opposed to nothing when creating a custom environment?

This is a general question about the advantages of using gym.Env as a superclass (as opposed to nothing):

I am thinking of building my own reinforcement learning environment for a small experiment. I have read a couple of blog posts on how to build one with the Env class from the OpenAI Gym package (for example https://medium.com/@apoddar573/making-your-own-custom-environment-in-gym-c3b65ff8cdaa). But it seems like I can create an environment without needing to use the class at all. E.g. if I wanted to create an env called Foo, the tutorials recommend I use something like

class FooEnv(gym.Env):

But I can just as well use

class FooEnv():

and my environment will still work in exactly the same way. I have seen one small benefit of using OpenAI Gym: I can initiate different versions of the environment in a cleaner way. But apart from that, can anyone describe or point out any resources on what big advantages the gym.Env superclass provides? I want to make sure I'm making full use of them :) Thanks!
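
For concreteness, here is roughly what I have in mind when subclassing (a minimal sketch only; the FooEnv dynamics below are invented for illustration and follow the classic (observation, reward, done, info) step API):

    import gym
    import numpy as np
    from gym import spaces

    class FooEnv(gym.Env):
        """Minimal sketch of a custom environment built on gym.Env."""

        def __init__(self):
            super().__init__()
            # Declaring the spaces is part of the gym.Env contract and lets
            # agents and wrappers inspect what the environment expects.
            self.action_space = spaces.Discrete(2)
            self.observation_space = spaces.Box(
                low=-np.inf, high=np.inf, shape=(1,), dtype=np.float32
            )
            self._state = 0.0
            self._steps = 0

        def reset(self):
            self._state = 0.0
            self._steps = 0
            return np.array([self._state], dtype=np.float32)

        def step(self, action):
            # Toy dynamics: action 1 increments the state, action 0 decrements it.
            self._state += 1.0 if action == 1 else -1.0
            self._steps += 1
            reward = float(self._state)
            done = self._steps >= 10  # episode ends after 10 steps
            return np.array([self._state], dtype=np.float32), reward, done, {}

Dropping the gym.Env parent from the class definition above would not change how this code runs, which is what prompts the question.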

I think it's more of a way to make sure that the community will produce standardized environments, where interactions with the environment object always take place in the same manner.

It would be annoying if every developer used different terminology for the same underlying concepts. For example, if no one creates a standard to comply with, you could have people defining the step method very differently (make_step, forward_step, take_action, env_step, etc.), or even giving it a different signature by modifying its parameters.

When you inherit from gym.Env, you make sure that you'll implement the needed methods, otherwise you'll get a NotImplementedError; see the source file:

    def step(self, action):
        """Run one timestep of the environment's dynamics. When end of
        episode is reached, you are responsible for calling `reset()`
        to reset this environment's state.

        Accepts an action and returns a tuple (observation, reward, done, info).

        Args:
            action (object): an action provided by the agent

        Returns:
            observation (object): agent's observation of the current environment
            reward (float): amount of reward returned after previous action
            done (bool): whether the episode has ended, in which case further step() calls will return undefined results
            info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
        """
        raise NotImplementedError

This way, Reinforcement Learning agents are more easily reused and connected to different environments without needing to modify the code.
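
For example, any training or evaluation loop written against the gym.Env interface can drive your environment without knowing anything about its internals. A small sketch (run_episode and the random policy below are placeholders for a real agent, not part of Gym itself):

    import gym

    def run_episode(env: gym.Env, policy):
        """Roll out one episode through the standard reset()/step() interface."""
        obs = env.reset()
        total_reward, done = 0.0, False
        while not done:
            action = policy(obs)                         # any callable: observation -> action
            obs, reward, done, info = env.step(action)   # classic 4-tuple step API
            total_reward += reward
        return total_reward

    # The same loop works for a custom FooEnv(gym.Env) and for built-in environments:
    # env = gym.make("CartPole-v1")
    # run_episode(env, lambda obs: env.action_space.sample())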

But overall you're right, you get few "direct" benefits from this parent class, except for the fact that your code will be more reusable, which is important!
