

Best algorithm for multi-agent continuous-space path finding using reinforcement learning

Posted 2019-06-24 05:05:08. Tags: deep-learning, artificial-intelligence, pytorch, reinforcement-learning, multi-agent

I am working on a project in which I need to find the best path from one point to another in continuous space in a multi-agent scenario. I am looking for the reinforcement learning algorithm that best suits this problem. I have tried "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments", but it does not seem to reach the goals within 10,000 episodes. How can I improve this algorithm, or is there another algorithm that could help me with this?

1 answer

Multi-agent reinforcement learning is quite hard to master and has yet to prove effective in the general case.

The problem is that in a multi-agent setting the environment becomes non-stationary from the perspective of each individual agent. This means that an agent's action cannot be mapped directly to the resulting state, because the other agents are acting separately at the same time, which "confuses" all of the agents. There is an in-depth collection of multi-agent research here: https://github.com/LantaoYu/MARL-Papers
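To make the non-stationarity point concrete, here is a minimal sketch in plain Python. The two-action coordination game, the `expected_reward` function, and the probabilities are all made up for illustration; they are not taken from the question's environment. Agent A takes the same action in the same state, yet its expected reward changes only because the other agent's policy has drifted during learning.

```python
def expected_reward(my_action, other_prob_action0):
    """Expected reward for agent A in a hypothetical 2-action coordination game.

    A receives 1.0 when both agents pick the same action and 0.0 otherwise.
    other_prob_action0 is the probability that the other agent picks action 0.
    """
    if my_action == 0:
        prob_match = other_prob_action0
    else:
        prob_match = 1.0 - other_prob_action0
    return prob_match * 1.0


# Same state and same action for agent A at two points in training.
# Only the other agent's (assumed) policy has changed in between.
early = expected_reward(my_action=0, other_prob_action0=0.9)
late = expected_reward(my_action=0, other_prob_action0=0.2)

print("early in training:", early)
print("later in training:", late)
```

From agent A's point of view the environment itself appears to have changed, which is why value estimates learned early on can become misleading later.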

If you would like to pursue the actor-critic method you mentioned and want to perfect Multi-Agent Actor-Critic (MADDPG), I recommend this paper for further research: https://arxiv.org/pdf/1706.02275.pdf
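As a rough sketch of the central idea in that paper: during training each agent's critic sees the observations and actions of all agents, while each actor only sees its own observation. The snippet below only shows how those inputs are assembled, using plain Python lists. The three agents, the 4-dimensional observation (position and goal), the 2-dimensional action (velocity), and the helper names `actor_input` and `critic_input` are assumptions for illustration, not the paper's code. In a PyTorch implementation these lists would be tensors fed into the actor and critic networks.

```python
def actor_input(observations, agent_index):
    """Decentralized actor: uses only the agent's own observation."""
    return list(observations[agent_index])


def critic_input(observations, actions):
    """Centralized critic: concatenates every agent's observation and action."""
    joint = []
    for obs in observations:
        joint.extend(obs)
    for act in actions:
        joint.extend(act)
    return joint


# Hypothetical setup: 3 agents, observation = (x, y, goal_x, goal_y),
# action = (vx, vy). The values are arbitrary.
observations = [
    [0.0, 0.0, 1.0, 1.0],
    [0.5, 0.5, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]
actions = [
    [0.1, 0.1],
    [-0.1, 0.1],
    [-0.1, 0.0],
]

actor_in = actor_input(observations, 0)
critic_in = critic_input(observations, actions)

print("actor input size:", len(actor_in))    # 4 values: own observation only
print("critic input size:", len(critic_in))  # 3*4 + 3*2 = 18 values
```

Because the critic conditions on everyone's actions, the environment looks stationary to it even while the other agents' policies change. At execution time only the actors are needed, so each agent can still act from its local observation.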