
Limit on Action Change in Reinforcement Learning

Original post: 2019-03-10 14:09:00 5 1 reinforcement-learning

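I want to build an autonomous ship in a virtual environment using DDPG.

The problem is that the action space for steering is (-180°, +180°), so DDPG can choose -180° at (t-1) and +180° at (t+1), which is impossible in the real world. (Basically, you cannot spin the steering wheel that fast.)

I think a possible solution looks like this.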

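Set a maximum steering rate (for example, 10° per step).
If the chosen action falls outside the available action range (current_steeringWheel_angle - 10°, current_steeringWheel_angle + 10°), change it to the nearest end of that range.
Take the changed action in the virtual environment.
(First option) Update DDPG with the changed action.
(Second option) Update DDPG with the action that was originally chosen.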
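The steps above can be sketched roughly as follows. This is only a minimal illustration, not the asker's actual code: the names `limit_action`, `make_transition` and `store_executed` are hypothetical, the 10° rate is the example value from the question, and the steering angle is assumed to be a plain bounded value in [-180, 180] with no wrap-around.

```python
MAX_STEER_RATE = 10.0                  # degrees per step (example value from the question)
STEER_MIN, STEER_MAX = -180.0, 180.0   # assumed hard limits of the steering wheel


def limit_action(raw_action, current_angle, max_rate=MAX_STEER_RATE):
    """Clamp the actor's raw steering command to what is reachable in one step."""
    # Reachable window around the current angle, intersected with the hard limits.
    low = max(current_angle - max_rate, STEER_MIN)
    high = min(current_angle + max_rate, STEER_MAX)
    return min(max(raw_action, low), high)


def make_transition(state, raw_action, executed_action, reward, next_state,
                    store_executed=True):
    """Build the replay tuple used for the DDPG update.

    store_executed=True  -> first option: learn from the changed action.
    store_executed=False -> second option: learn from the originally chosen action.
    """
    action = executed_action if store_executed else raw_action
    return (state, action, reward, next_state)


if __name__ == "__main__":
    current_angle = 0.0
    raw = 180.0                                  # the actor asks for a full swing
    executed = limit_action(raw, current_angle)  # clamped to the reachable window
    print(executed)                              # 10.0
    print(make_transition("s", raw, executed, 0.0, "s_next"))
    print(make_transition("s", raw, executed, 0.0, "s_next", store_executed=False))
```

The environment always receives the clamped value; the two options differ only in which action is written to the replay buffer.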

1 Answer

I think I found the solution.

First reference:

(Source: https://stats.stackexchange.com/questions/378008/how-to-handle-a-changing-action-space-in-reinforcement-learning/378025#378025?newreg=09ef385b87a54f27b5011f983dbf0270)

Second reference (basically much the same as the one above):

https://stats.stackexchange.com/questions/328835/enforcing-game-rules-in-alpha-go-zero

