Tensorflow 2.x Agents (TF-Agents, Reinforcement Learning Module) & PySC2

There are examples that combine pysc2 (https://github.com/deepmind/pysc2) with TensorFlow (1.x) and OpenAI Baselines (https://github.com/openai/baselines), like the following:

https://github.com/chris-chris/pysc2-examples
https://github.com/llSourcell/A-Guide-to-DeepMinds-StarCraft-AI-Environment

The TF team has recently come up with an RL implementation (an alternative to OpenAI Baselines) called TF-Agents (https://github.com/tensorflow/agents). Examples:

https://github.com/tensorflow/agents/blob/master/docs/tutorials/1_dqn_tutorial.ipynb
https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_05_apply_rl.ipynb
https://github.com/jeffheaton/t81_558_deep_learning/blob/master/t81_558_class_12_04_atari.ipynb

For TF-Agents, you do:

import tensorflow as tf

from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import suite_gym
from tf_agents.environments import tf_py_environment
from tf_agents.networks import q_network
from tf_agents.utils import common

env_name = 'CartPole-v0'
train_py_env = suite_gym.load(env_name)
eval_py_env = suite_gym.load(env_name)

# Wrap the Python environments so the agent sees batched TF tensors.
train_env = tf_py_environment.TFPyEnvironment(train_py_env)
eval_env = tf_py_environment.TFPyEnvironment(eval_py_env)

fc_layer_params = (100,)
learning_rate = 1e-3

q_net = q_network.QNetwork(
    train_env.observation_spec(),
    train_env.action_spec(),
    fc_layer_params=fc_layer_params)

optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=learning_rate)
train_step_counter = tf.Variable(0)

agent = dqn_agent.DqnAgent(
    train_env.time_step_spec(),
    train_env.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)
agent.initialize()
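
The snippet stops at agent.initialize(). To actually train, the usual next steps are to collect experience with a driver into a replay buffer and sample from it, as in the DQN tutorial linked above. A minimal sketch (the buffer size, batch size, and step counts here are placeholder values, not anything prescribed):

from tf_agents.drivers import dynamic_step_driver
from tf_agents.replay_buffers import tf_uniform_replay_buffer

# Buffer for experience collected by the agent's exploration policy.
replay_buffer = tf_uniform_replay_buffer.TFUniformReplayBuffer(
    data_spec=agent.collect_data_spec,
    batch_size=train_env.batch_size,
    max_length=100000)

# Driver that steps the environment and records each transition.
collect_driver = dynamic_step_driver.DynamicStepDriver(
    train_env,
    agent.collect_policy,
    observers=[replay_buffer.add_batch],
    num_steps=1)

# Seed the buffer with some initial experience before sampling from it.
for _ in range(100):
    collect_driver.run()

# DQN trains on pairs of adjacent time steps, hence num_steps=2.
dataset = replay_buffer.as_dataset(
    sample_batch_size=64, num_steps=2).prefetch(3)
iterator = iter(dataset)

for _ in range(1000):
    collect_driver.run()
    experience, _ = next(iterator)
    loss = agent.train(experience).loss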

For pysc2:

from pysc2.env import sc2_env
from pysc2.lib import actions
from pysc2.lib import features

step_mul = 8  # game steps per agent step

AGENT_INTERFACE_FORMAT = features.AgentInterfaceFormat(
    feature_dimensions=features.Dimensions(screen=32, minimap=32))

mineral_env = sc2_env.SC2Env(
    map_name="CollectMineralShards",
    players=[sc2_env.Agent(sc2_env.Race.terran)],
    step_mul=step_mul,
    agent_interface_format=AGENT_INTERFACE_FORMAT,
    visualize=True)
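
For comparison, the raw SC2Env is driven roughly like this (a sketch assuming the setup above): pysc2 expects a list of actions, one per agent, and returns a tuple of its own TimeStep objects, which is exactly the interface mismatch with TF-Agents.

timesteps = mineral_env.reset()
while not timesteps[0].last():
    # Issue a no-op for the single agent until the episode ends.
    timesteps = mineral_env.step([actions.FUNCTIONS.no_op()])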

How do I combine TF-Agents and PySC2 together? They are both Google products.

I've recently stumbled on a very similar situation, where I wanted to use the hanabi-learning-environment developed by DeepMind with TF-Agents. I'm afraid I have to tell you that there is no nice solution to this.

What you must do is fork the DeepMind repo and modify the environment wrapper to be compatible with what TF-Agents requires. It's going to be quite some work, especially if you are not familiar with how environments are defined in TF-Agents, but it is definitely something that can be done in about a week of work.
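
Applied to the pysc2 case from the question, the approach amounts to subclassing TF-Agents' py_environment.PyEnvironment and translating specs, resets, and steps between the two APIs. The sketch below is only illustrative and makes heavy simplifying assumptions (a hypothetical two-action action set and a single feature-screen layer as the observation); it is not the actual code from my repo:

import numpy as np

from pysc2.env import sc2_env
from pysc2.lib import actions, features
from tf_agents.environments import py_environment
from tf_agents.specs import array_spec
from tf_agents.trajectories import time_step as ts


class SC2PyEnvironment(py_environment.PyEnvironment):
    """Hypothetical TF-Agents wrapper around pysc2's SC2Env."""

    # Illustrative, heavily reduced action set: do nothing, or select
    # the whole army. A real wrapper has to deal with pysc2's full
    # parameterised action space, which is the hard part.
    _ACTIONS = [
        actions.FUNCTIONS.no_op(),
        actions.FUNCTIONS.select_army("select"),
    ]

    def __init__(self, screen_size=32):
        super().__init__()
        self._episode_ended = False
        self._sc2 = sc2_env.SC2Env(
            map_name="CollectMineralShards",
            players=[sc2_env.Agent(sc2_env.Race.terran)],
            agent_interface_format=features.AgentInterfaceFormat(
                feature_dimensions=features.Dimensions(
                    screen=screen_size, minimap=screen_size)),
            step_mul=8)
        self._action_spec = array_spec.BoundedArraySpec(
            shape=(), dtype=np.int32,
            minimum=0, maximum=len(self._ACTIONS) - 1, name='action')
        # Observation: just the player_relative feature-screen layer
        # (values 0-4), again a simplification for the sketch.
        self._observation_spec = array_spec.BoundedArraySpec(
            shape=(screen_size, screen_size), dtype=np.int32,
            minimum=0, maximum=4, name='screen')

    def action_spec(self):
        return self._action_spec

    def observation_spec(self):
        return self._observation_spec

    def _observe(self, timestep):
        return np.array(
            timestep.observation.feature_screen.player_relative,
            dtype=np.int32)

    def _reset(self):
        self._episode_ended = False
        timestep = self._sc2.reset()[0]  # one TimeStep per agent
        return ts.restart(self._observe(timestep))

    def _step(self, action):
        if self._episode_ended:
            # The previous step ended the episode; start a new one.
            return self.reset()
        timestep = self._sc2.step([self._ACTIONS[int(action)]])[0]
        obs = self._observe(timestep)
        if timestep.last():
            self._episode_ended = True
            return ts.termination(obs, reward=timestep.reward)
        return ts.transition(obs, reward=timestep.reward, discount=1.0)

If a wrapper like this passes tf_agents.environments.utils.validate_py_environment, it can be wrapped in TFPyEnvironment and handed to DqnAgent just like the CartPole example in the question.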

If you want to get an idea of what I did, you can look at the original rl_env.py code in the Hanabi repo from DeepMind, and what I modified it into in my repo.

I have no idea why DeepMind sticks to their structure instead of making their code more compatible, but this is how it is.
