How to define action space in custom gym environment that receives 3 scalars and a matrix each turn?
For a personal project, I need to define a custom gym environment that runs a certain board game. Each turn of the game, the environment takes the state of the board as a matrix of ones and zeros, and an action, described as a tuple:
(integer, integer, small matrix)
From reading online, I know that a gym env should take this shape:
    class CustomEnv(gym.Env):
        """Custom Environment that follows gym interface"""
        metadata = {'render.modes': ['human']}

        def __init__(self, arg1, arg2, ...):
            super(CustomEnv, self).__init__()
            self.action_space =
            self.observation_space =

        def step(self, action):
            ...

        def reset(self):
            ...

        def render(self, mode='human', close=False):
            ...
Now, I feel like the action input here does not exactly fall into "discrete" or "continuous". How should I implement the action part of the init function and the step function?
Defining your action space in the init function is fairly straightforward using gym's Tuple space:
    from gym import spaces

    space = spaces.Tuple((
        spaces.Discrete(5),
        spaces.Discrete(4),
        spaces.Box(low=0, high=1, shape=(2, 2))))
The Discrete space represents a range of integers (Discrete(n) covers 0 through n-1), and the Box space represents an n-dimensional array of bounded values. You can print a sample of your space to get an idea of what it looks like:
    print(space.sample())
    >>> (3, 1, array([[0.20318432, 0.26787955], [0.5323673 , 0.6564413 ]], dtype=float32))
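If you want to check that a hand-built action has the layout the space expects, the space's contains method can help. The snippet below is only a small sketch: the names good and bad are made up for illustration, and the explicit dtype=np.float32 is an assumption chosen to match the sample output above. The first check should report that the action is valid, and the second should reject it because 5 lies outside Discrete(5).

```python
import numpy as np
from gym import spaces

# Same layout as the space above: two integers and a 2x2 matrix.
space = spaces.Tuple((
    spaces.Discrete(5),
    spaces.Discrete(4),
    spaces.Box(low=0, high=1, shape=(2, 2), dtype=np.float32),
))

# Hypothetical hand-built actions, used only for illustration.
good = (3, 1, np.array([[0.2, 0.3], [0.5, 0.7]], dtype=np.float32))
bad = (5, 1, np.zeros((2, 2), dtype=np.float32))  # 5 is not in 0..4

print(space.contains(good))
print(space.contains(bad))
```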
For the step function, you just need to interact with your environment based on the input action, which will be formatted just like the sample: a tuple of two integers and a 2x2 array that you can unpack directly.
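Putting the pieces together, an environment along these lines might look roughly like the sketch below. Everything game-specific here is an assumption and not something from the question: the class name BoardEnv, the rows and cols parameters, treating the two integers as a board position, the int8 Box used for the board observation, and the placeholder rule in step. It follows the older gym interface shown in the question, where step returns four values.

```python
import gym
import numpy as np
from gym import spaces


class BoardEnv(gym.Env):
    """Hypothetical board-game environment; the game rules are placeholders."""

    metadata = {'render.modes': ['human']}

    def __init__(self, rows=8, cols=8):
        super().__init__()
        self.rows, self.cols = rows, cols
        # Action: (integer, integer, small matrix), as in the question.
        self.action_space = spaces.Tuple((
            spaces.Discrete(rows),
            spaces.Discrete(cols),
            spaces.Box(low=0, high=1, shape=(2, 2), dtype=np.float32),
        ))
        # Observation: the board as a matrix of ones and zeros.
        self.observation_space = spaces.Box(
            low=0, high=1, shape=(rows, cols), dtype=np.int8)
        self.board = np.zeros((rows, cols), dtype=np.int8)

    def reset(self):
        self.board = np.zeros((self.rows, self.cols), dtype=np.int8)
        return self.board.copy()

    def step(self, action):
        assert self.action_space.contains(action), "invalid action"
        # The action has the same layout as action_space.sample().
        row, col, patch = action
        # Placeholder rule (an assumption): mark the chosen cell as occupied.
        # The patch is unpacked but unused; real game logic would go here.
        self.board[row, col] = 1
        reward = 0.0
        done = bool(self.board.all())
        return self.board.copy(), reward, done, {}

    def render(self, mode='human', close=False):
        print(self.board)
```

A short usage example: env = BoardEnv(rows=3, cols=3), then env.reset(), then env.step((1, 2, np.zeros((2, 2), dtype=np.float32))) returns the updated board with cell (1, 2) set to 1.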