
Unexpected observation space for CartPole-v0

I'm surprised by the observation space I get when introspecting CartPole-v0.

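According to the official documentation here, this is what I should get:

[image: observation table from the documentation]

However, this is what I actually get: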

import gym

env = gym.make("CartPole-v0")
print(env.observation_space.low)
print(env.observation_space.high)
# [-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38]
# [4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38]

I am using the latest version of gym:

!pip list|grep gym
gym                 0.12.1   

Any idea what is going on?

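You appear to be getting the expected behavior as documented in the code, although it is admittedly a bit confusing. The observation space for cart position is [-4.8, 4.8], but in practice the episode should end as soon as the cart reaches the limits [-2.4, 2.4]. The situation for pole angle is similar: the observation space spans ±24 degrees, while the episode terminates once the angle exceeds 12 degrees.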

class CartPoleEnv(gym.Env):
"""
Description:
    A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The pendulum starts upright, and the goal is to prevent it from falling over by increasing and reducing the cart's velocity.

Source:
    This environment corresponds to the version of the cart-pole problem described by Barto, Sutton, and Anderson

Observation: 
    Type: Box(4)
    Num Observation                 Min         Max
    0   Cart Position             -4.8            4.8
    1   Cart Velocity             -Inf            Inf
    2   Pole Angle                 -24 deg        24 deg
    3   Pole Velocity At Tip      -Inf            Inf

Actions:
    Type: Discrete(2)
    Num Action
    0   Push cart to the left
    1   Push cart to the right

    Note: The amount the velocity that is reduced or increased is not fixed; it depends on the angle the pole is pointing. This is because the center of gravity of the pole increases the amount of energy needed to move the cart underneath it

Reward:
    Reward is 1 for every step taken, including the termination step

Starting State:
    All observations are assigned a uniform random value in [-0.05..0.05]

Episode Termination:
    Pole Angle is more than 12 degrees
    Cart Position is more than 2.4 (center of the cart reaches the edge of the display)
    Episode length is greater than 200
    Solved Requirements
    Considered solved when the average reward is greater than or equal to 195.0 over 100 consecutive trials.
"""
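To make the distinction in the docstring concrete, here is a minimal sketch of the termination check it describes, written without gym so it can be run on its own. The names `X_THRESHOLD`, `THETA_THRESHOLD_RADIANS`, `is_terminal`, and `in_observation_space` are illustrative and are not gym's API; the numeric values come from the docstring and from the output in the question.

```python
import math

# Termination thresholds taken from the docstring (names are illustrative).
X_THRESHOLD = 2.4                            # cart position limit
THETA_THRESHOLD_RADIANS = math.radians(12)   # pole angle limit, 12 degrees

# Declared observation-space bounds, as reported in the question.
OBS_HIGH_X = 4.8
OBS_HIGH_THETA = math.radians(24)


def is_terminal(x, theta):
    """Return True if position or angle is past a termination threshold."""
    return abs(x) > X_THRESHOLD or abs(theta) > THETA_THRESHOLD_RADIANS


def in_observation_space(x, theta):
    """Return True if position and angle lie inside the declared bounds."""
    return abs(x) <= OBS_HIGH_X and abs(theta) <= OBS_HIGH_THETA


# A cart position of 3.0 is a valid observation, yet it ends the episode.
state = (3.0, 0.0)
print(in_observation_space(*state), is_terminal(*state))
```

In other words, the declared bounds describe which values an observation may take, not which states keep the episode alive.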

You can read the related GitHub issue at this link.

*Note that 24 degrees is equivalent to 4.1887903e-01 radians.
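The numbers in the question can be checked by hand. The sketch below is an interpretation rather than gym's own code: it assumes, from the docstring and the printed output, that each declared bound is twice the corresponding termination threshold (4.8 = 2 × 2.4 and 24° = 2 × 12°), and that the ±3.4028235e+38 entries are the largest finite float32 value standing in for Inf. The helper `to_float32` and the variable names are hypothetical.

```python
import math
import struct


def to_float32(value):
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


# Termination thresholds from the docstring.
x_threshold = 2.4
theta_threshold = math.radians(12)

# Assumed relationship: declared bound = 2 * termination threshold.
x_bound = to_float32(2 * x_threshold)
theta_bound = to_float32(2 * theta_threshold)

# Largest finite float32 value, used here in place of Inf for the velocities.
float32_max = (2 - 2 ** -23) * 2.0 ** 127

print(f"{x_bound:.7e}")      # cart position bound
print(f"{theta_bound:.7e}")  # pole angle bound, in radians
print(f"{float32_max:.7e}")  # velocity bounds
```

The small deviations from round numbers (for example 4.8000002 instead of 4.8) come from storing the bounds as float32.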

This looks like outdated documentation. An issue has already been created: https://github.com/openai/gym/issues/368
