
Transfer Discrete Action to Continuous Action in Reinforcement Learning

In reinforcement learning, we know empirically that discrete actions are easier to train with than continuous actions.

But in theory, continuous actions are more accurate and faster, just like humans: most of our actions are continuous.

So is there any method or related research for training a discrete-action policy first, for an easier start, and then transferring that policy to output continuous actions for better precision?

Thanks.

You can certainly do that; any paper on continuous control with reinforcement learning does this. The only ones that don't are those using deep reinforcement learning, or reinforcement learning with function approximation. My research applies both reinforcement learning and deep reinforcement learning to dynamical systems. I discretize my state and action spaces to an adequate resolution, and then apply the result to control problems.
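
As a rough illustration of that discretization step (a sketch, not my actual code; NumPy, the 1-D ranges, and the grid resolutions are all assumptions):

```python
import numpy as np

# Hypothetical 1-D state in [-1, 1] and action in [-2, 2],
# discretized to illustrative resolutions.
state_bins = np.linspace(-1.0, 1.0, num=21)    # 21 state grid points
action_bins = np.linspace(-2.0, 2.0, num=11)   # 11 action grid points

def discretize(x, bins):
    """Map a continuous value to the index of its nearest grid point."""
    return int(np.argmin(np.abs(bins - x)))

# Tabular Q-function over the discretized spaces.
Q = np.zeros((len(state_bins), len(action_bins)))

s_idx = discretize(0.13, state_bins)     # snap the state onto the grid
a_idx = int(np.argmax(Q[s_idx]))         # greedy discrete action index
action = action_bins[a_idx]              # map back to a continuous value
```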

I am currently working on methods to make the discretized system work for continuous spaces. One method is linear interpolation: if your state falls between two discretized points, you can use linear interpolation to identify the optimal action in the continuous space. It works especially well for linear systems, since the control law is linear:

u = Kx

And this method is directly in line with what you ask: training on a discrete space, then applying the policy to a continuous control problem.
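
To make the interpolation concrete, here is a minimal sketch (it reuses the assumed state_bins, action_bins, and tabular Q from the sketch above; the function name is illustrative):

```python
import numpy as np

def interpolated_action(x, state_bins, Q, action_bins):
    """Blend the greedy actions of the two grid states neighbouring x."""
    # Index of the grid point just below x, clamped so i+1 stays in range.
    i = int(np.clip(np.searchsorted(state_bins, x) - 1,
                    0, len(state_bins) - 2))
    x0, x1 = state_bins[i], state_bins[i + 1]
    w = (x - x0) / (x1 - x0)                   # interpolation weight in [0, 1]
    a0 = action_bins[int(np.argmax(Q[i]))]     # greedy action at left point
    a1 = action_bins[int(np.argmax(Q[i + 1]))] # greedy action at right point
    return (1.0 - w) * a0 + w * a1             # continuous action
```

For a linear plant with u = Kx, the greedy actions at neighbouring grid points lie on the same line, so the interpolated action recovers the continuous control law between them.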

However, continuous control problems are traditionally solved using either linear function approximation, such as tile coding, or non-linear function approximation, such as artificial neural networks. Those methods are more advanced; I would suggest trying the more basic discrete RL methods first. I have RL code on my GitHub you can use; let me know if you have any issues.
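
For reference, here is a minimal sketch of tile coding, one of the linear function-approximation methods mentioned above (the tiling count, range, and offsets are illustrative assumptions, not a reference implementation):

```python
import numpy as np

def tile_code(x, n_tilings=4, tiles_per_tiling=8, lo=-1.0, hi=1.0):
    """Return the indices of the active tiles for a scalar x."""
    width = (hi - lo) / tiles_per_tiling
    active = []
    for t in range(n_tilings):
        offset = t * width / n_tilings           # each tiling is shifted slightly
        idx = int((x - lo + offset) // width)
        idx = min(max(idx, 0), tiles_per_tiling) # clamp values at the edges
        active.append(t * (tiles_per_tiling + 1) + idx)
    return active

# Linear value estimate: the sum of the weights of the active tiles.
weights = np.zeros(4 * 9)                        # 4 tilings x 9 possible tiles
value = sum(weights[i] for i in tile_code(0.3))
```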

