OpenAI gym: when is reset required?

Although I can get the examples and my own code to run, I am more curious about the real semantics and expectations behind the OpenAI gym API, in particular Env.reset().

When is reset expected or required? At the end of each episode? Or only after creating an environment?

I rather think it makes sense before each episode, but I have not been able to find that stated explicitly!

You typically call reset after an entire episode. That could be after you reach a terminal state in the MDP, or after you reach your maximum number of time steps (set by you). I also typically reset at the very start of training.

So if you are at your starting state 'A' and you want to reach state 'Z', you would run your time steps going from 'A' -> 'B' -> 'C' ...; then, when you reach the terminal state 'Z', you start a new episode using reset, which takes you back to 'A'.

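    # Classic Gym API: reset() returns the observation, step() returns a 4-tuple.
    # Assumes env, iterations and take_action are defined elsewhere.
    for episode in range(iterations):
        state = env.reset()  # first state of the new episode
        for time_step in range(1000):  # maximum number of steps per episode
            action = take_action(state)
            state, reward, done, _ = env.step(action)
            if done:
                break  # go on to the next episode, where the environment is reset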
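To try the loop above without installing Gym, here is a minimal sketch with a toy stand-in environment. `ChainEnv`, `take_action`, and `run` are hypothetical names made up for this illustration; the class only mimics the classic `reset()` / `step()` interface used in the answer and is not a real Gym environment.

```python
import string


class ChainEnv:
    """Toy stand-in for a Gym environment: states 'A'..'Z', 'Z' is terminal."""

    STATES = string.ascii_uppercase  # 'A', 'B', ..., 'Z'

    def __init__(self):
        self.index = None  # no valid state until reset() is called

    def reset(self):
        # Start a new episode: always go back to 'A' and return it.
        self.index = 0
        return self.STATES[self.index]

    def step(self, action):
        if self.index is None:
            raise RuntimeError("call reset() before step()")
        # Action 1 moves one state forward; any other action stays put.
        if action == 1:
            self.index += 1
        state = self.STATES[self.index]
        done = state == "Z"  # terminal state reached
        reward = 1.0 if done else 0.0
        return state, reward, done, {}


def take_action(state):
    # Hypothetical policy: always move forward.
    return 1


def run(env, episodes, max_steps):
    """Run the episode loop; record (first_state, last_state, steps) per episode."""
    history = []
    for episode in range(episodes):
        state = env.reset()  # first state of every episode
        first_state = state
        steps = 0
        for time_step in range(max_steps):
            state, reward, done, _ = env.step(take_action(state))
            steps += 1
            if done:
                break  # terminal state: the next episode starts with reset()
        history.append((first_state, state, steps))
    return history


if __name__ == "__main__":
    # Enough steps to reach 'Z' (25 moves from 'A').
    print(run(ChainEnv(), episodes=2, max_steps=1000))
    # Step cap hit before 'Z': the episode still ends, and reset() still follows.
    print(run(ChainEnv(), episodes=2, max_steps=10))
```

Every episode starts at 'A' because of `reset()`, whether the previous one ended at 'Z' or at the step cap. In Gym 0.26 and later (and in Gymnasium), `reset()` returns an `(observation, info)` pair and `step()` returns five values, with `terminated` and `truncated` replacing `done`; the reset-per-episode pattern stays the same.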

Put simply, env.reset() resets the whole environment, so you need to call it at the start of each episode.

env.reset()

This is an example of a reset function inside a custom environment. In this case it just resets the enemy position and the time.
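A minimal sketch of what such a reset function might look like, assuming a hypothetical environment that tracks only an enemy position and a step counter. `EnemyEnv`, `enemy_start`, `enemy_pos`, and `time` are made-up names, and the class deliberately does not subclass a real Gym `Env` so that it runs without Gym installed.

```python
class EnemyEnv:
    """Hypothetical custom environment with an enemy position and a clock."""

    def __init__(self, enemy_start=(5, 5)):
        self.enemy_start = enemy_start
        self.enemy_pos = enemy_start
        self.time = 0

    def reset(self):
        # Put everything that changes during an episode back to its initial value.
        self.enemy_pos = self.enemy_start
        self.time = 0
        return self._observation()

    def step(self, action):
        # Toy dynamics: the enemy drifts one cell to the right each step.
        x, y = self.enemy_pos
        self.enemy_pos = (x + 1, y)
        self.time += 1
        done = self.time >= 3  # short fixed-length episode, for illustration only
        return self._observation(), 0.0, done, {}

    def _observation(self):
        return {"enemy_pos": self.enemy_pos, "time": self.time}


env = EnemyEnv()
env.reset()
for _ in range(3):
    obs, reward, done, info = env.step(action=None)
print(obs, done)    # state has drifted away from the start
print(env.reset())  # reset() restores the initial state for the next episode
```

Anything the episode mutates belongs in `reset()`; values that stay fixed for the lifetime of the environment, such as `enemy_start` here, can be set once in `__init__`.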

I guess you get a better understanding by seeing what is inside the environment.

Sorry for the late response.

 