Multi-agent reinforcement learning environment for a public transport problem

Original post: 2020-09-23 11:15:57 | Tags: python, reinforcement-learning, openai-gym, agent-based-modeling, multi-agent

For my MSc thesis I want to apply multi-agent RL to a bus control problem. The idea is that the buses operate on a given line, but without a timetable. The line has bus stops where passengers accumulate over time until a bus picks them up: the longer the interval between buses, the more passengers will be waiting at the stop (on average, since arrivals are a stochastic process). I also want to implement some intersections where buses will have to wait for a green light.
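To make the accumulation effect concrete, here is a minimal sketch that assumes passengers arrive at a stop as a Poisson process with a constant rate. The question only says arrivals are stochastic, so the Poisson model, the function names, and the rate of 0.02 passengers per second are all illustrative assumptions.

```python
import random


def passengers_arrived(headway_s: float, rate_per_s: float, rng: random.Random) -> int:
    """Count arrivals during one headway, assuming a Poisson process.

    Inter-arrival times of a Poisson process are exponentially distributed,
    so we keep drawing gaps until we run past the end of the headway.
    """
    count = 0
    t = rng.expovariate(rate_per_s)
    while t <= headway_s:
        count += 1
        t += rng.expovariate(rate_per_s)
    return count


def mean_waiting(headway_s: float, rate_per_s: float,
                 n_samples: int = 2000, seed: int = 0) -> float:
    """Average number of passengers waiting when the next bus arrives."""
    rng = random.Random(seed)
    total = sum(passengers_arrived(headway_s, rate_per_s, rng) for _ in range(n_samples))
    return total / n_samples


if __name__ == "__main__":
    # Assumed arrival rate: 0.02 passengers per second (72 per hour).
    for headway in (300.0, 600.0):
        print(headway, mean_waiting(headway, rate_per_s=0.02))
```

Under this assumption the expected number waiting is simply rate times headway, so doubling the interval between buses doubles the average queue at the stop.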

I'm not sure yet what my reward function will look like, but it will be something along the lines of keeping the intervals between buses as regular as possible or minimising the total travel time of the passengers.
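The two candidate objectives could be sketched as follows. This is only one possible formulation: using the negative variance of headways as the regularity measure, and the function names themselves, are assumptions rather than anything fixed by the problem.

```python
from statistics import pvariance
from typing import Sequence


def headway_regularity_reward(arrival_times_s: Sequence[float]) -> float:
    """Negative variance of the headways observed at one stop.

    `arrival_times_s` holds the sorted times at which successive buses
    reached the stop. Perfectly regular service gives a reward of 0;
    bunching makes it more negative.
    """
    headways = [b - a for a, b in zip(arrival_times_s, arrival_times_s[1:])]
    if len(headways) < 2:
        return 0.0  # variance is not informative with fewer than two headways
    return -float(pvariance(headways))


def travel_time_reward(passenger_travel_times_s: Sequence[float]) -> float:
    """Negative total travel time summed over all passengers."""
    return -float(sum(passenger_travel_times_s))


if __name__ == "__main__":
    print(headway_regularity_reward([0, 300, 600, 900]))  # evenly spaced buses
    print(headway_regularity_reward([0, 100, 600, 900]))  # bunched buses
    print(travel_time_reward([60, 120]))
```

The two signals are on very different scales, so combining them into one reward would need a weighting that has to be tuned for the case study.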

The agents in the problem will be the buses, but also the traffic lights. The traffic lights can choose when to show a green light for which road: apart from the buses, they also have other traffic demand to serve. The buses can choose to speed up, slow down, wait longer at a stop, or continue at normal speed.

To put this problem in an RL framework I will need an environment and suitable RL algorithms. Ideally I would have a flexible simulation environment in which to re-create my case-study bus line and connect it to off-the-shelf RL algorithms. However, so far I haven't found one. This means I may have to connect a simulation environment to something like OpenAI Gym myself.

Does anyone have advice on which simulation environment may be suitable? And is it possible to connect it to off-the-shelf RL algorithms?

I feel most comfortable programming in Python, but other languages are an option as well (although this would mean considerable extra effort on my side).

So far I have found the following simulation environments that may be suitable:

NetLogo
SimPy
Mesa
MATSim (https://www.matsim.org)
MATLAB
CityFlow (https://cityflow-project.github.io/#about)
Flatland (https://www.aicrowd.com/challenges/neurips-2020-flatland-challenge/)

For the RL algorithms the options seem to be:

Code them myself
Create the environment according to the OpenAI Gym API guidelines and use the OpenAI Baselines algorithms.
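For the second option, a hand-written environment might look roughly like the sketch below. It deliberately does not import Gym: it only mirrors the `reset()` / `step()` convention, with observations, rewards, and done flags returned as dictionaries keyed by agent id. The class `ToyBusLineEnv`, the circular line, the speed values, and the gap-based reward are all hypothetical simplifications, and stops, passengers, and traffic-light agents are left out entirely.

```python
import random
from enum import IntEnum


class BusAction(IntEnum):
    SLOW_DOWN = 0
    NORMAL = 1
    SPEED_UP = 2
    HOLD = 3  # stands in for "wait longer at a stop"


# Assumed speed in m/s for each action (illustrative values only).
SPEED_M_PER_S = {
    BusAction.SLOW_DOWN: 6.0,
    BusAction.NORMAL: 8.0,
    BusAction.SPEED_UP: 10.0,
    BusAction.HOLD: 0.0,
}


class ToyBusLineEnv:
    """Buses circulating on a loop; a toy model, not a real traffic simulator."""

    def __init__(self, n_buses=3, line_length_m=6000.0, dt_s=10.0, max_steps=100, seed=0):
        if n_buses < 2:
            raise ValueError("need at least two buses to define a gap")
        self.agent_ids = [f"bus_{i}" for i in range(n_buses)]
        self.line_length_m = line_length_m
        self.dt_s = dt_s
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.positions = {}
        self.steps = 0

    def reset(self):
        """Place buses at random positions and return the first observations."""
        self.steps = 0
        self.positions = {aid: self.rng.uniform(0.0, self.line_length_m)
                          for aid in self.agent_ids}
        return self._observations()

    def _gap_ahead(self, agent_id):
        """Distance along the loop to the nearest bus in front."""
        pos = self.positions[agent_id]
        return min((other - pos) % self.line_length_m
                   for aid, other in self.positions.items() if aid != agent_id)

    def _observations(self):
        # Each bus observes (own position, gap to the bus ahead).
        return {aid: (self.positions[aid], self._gap_ahead(aid)) for aid in self.agent_ids}

    def step(self, actions):
        """Advance all buses by one time step given {agent_id: BusAction}."""
        for aid in self.agent_ids:
            speed = SPEED_M_PER_S[BusAction(actions[aid])]
            self.positions[aid] = (self.positions[aid] + speed * self.dt_s) % self.line_length_m
        self.steps += 1
        obs = self._observations()
        # Reward: penalise deviation of the gap ahead from perfectly even spacing.
        ideal_gap = self.line_length_m / len(self.agent_ids)
        rewards = {aid: -abs(gap - ideal_gap) for aid, (_, gap) in obs.items()}
        done = self.steps >= self.max_steps
        dones = {aid: done for aid in self.agent_ids}
        return obs, rewards, dones, {}


def random_rollout(env, seed=0):
    """Run one episode with uniformly random actions; return total reward per bus."""
    rng = random.Random(seed)
    obs = env.reset()
    totals = {aid: 0.0 for aid in obs}
    done = False
    while not done:
        actions = {aid: rng.choice(list(BusAction)) for aid in obs}
        obs, rewards, dones, _ = env.step(actions)
        for aid, r in rewards.items():
            totals[aid] += r
        done = all(dones.values())
    return totals


if __name__ == "__main__":
    print(random_rollout(ToyBusLineEnv(max_steps=50, seed=1)))
```

To plug something like this into an existing algorithm library, the observation and action spaces would still have to be declared in whatever form that library expects, and the toy dynamics would be replaced by calls into the chosen simulator.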

I would love to hear suggestions and advice on which environments may be most suitable for my problem!