
TFAgents: how to take into account invalid actions

Original post: 2020-12-08 16:08:05 · Tags: tensorflow, reinforcement-learning, tensorflow-agents

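I am using the TF-Agents library for reinforcement learning, and I would like to take into account that, for a given state, some actions are invalid.

How can this be implemented?

Should I define an "observation_and_action_constraint_splitter" function when creating the DqnAgent?

If so, do you know of any tutorial on this?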

1 Answer

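Yes, you need to define the function, pass it to the agent, and also change the environment's output appropriately so that the function can work with it. I am not aware of any tutorials on this, but you can look at this repo I have been working on.

Note that it is very messy: many of the files in it are not actually used, and the docstrings are terrible and often wrong (I forked it and did not bother to tidy everything up). However, it definitely works correctly. The parts relevant to your question are: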

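rl_env.py, in HanabiEnv.__init__, where _observation_spec is defined as a dictionary of ArraySpecs (here). You can ignore game_obs, hand_obs and knowledge_obs, which are used to run the environment verbosely; they are not fed to the agent.
rl_env.py, in HanabiEnv._reset at line 110, gives an idea of how the time step observations are constructed and returned from the environment. legal_moves is passed through np.logical_not because my specific environment marks legal moves with 0 and illegal ones with -inf, whereas TF-Agents expects 1/True for a legal move. Cast to bool, my vector would therefore be the exact opposite of what TF-Agents expects.
These observations are then fed to observation_and_action_constraint_splitter in utility.py (here), which returns a tuple containing the observations and the action constraints. Note that game_obs, hand_obs and knowledge_obs are implicitly thrown away (and not fed to the agent, as mentioned above).
Finally, this observation_and_action_constraint_splitter is passed to the agent in utility.py, for example in the create_agent function at line 198.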
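
The mask conversion and the splitter described above can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the code from the repo: the dictionary keys state and legal_moves, the helper name to_tf_agents_mask, and the vector sizes are assumptions made up for the example.

```python
import numpy as np


def to_tf_agents_mask(legal_moves):
    """Turn a 0 / -inf legality vector into a 1 / 0 action mask.

    Hypothetical helper: 0 marks a legal move and -inf an illegal one,
    as in the environment described in the answer.
    """
    # logical_not maps 0 -> True (legal) and any nonzero value,
    # including -inf, -> False (illegal).
    return np.logical_not(legal_moves).astype(np.int32)


def observation_and_action_constraint_splitter(observation):
    """Split a dict observation into (network input, action mask).

    Any other entries in the dict (e.g. game_obs) are simply ignored,
    so they never reach the agent's network.
    """
    return observation["state"], observation["legal_moves"]


# Example observation as the environment might emit it (sizes are made up).
raw_legal_moves = np.array([0.0, -np.inf, 0.0, -np.inf], dtype=np.float32)
observation = {
    "state": np.zeros(5, dtype=np.float32),
    "legal_moves": to_tf_agents_mask(raw_legal_moves),
    "game_obs": "only used for verbose runs",
}

net_input, mask = observation_and_action_constraint_splitter(observation)
print(mask)
```

In TF-Agents the splitter is then handed to DqnAgent through its observation_and_action_constraint_splitter argument. As far as I understand the API, the Q-network should in that case be built from the spec of the network input only (the state entry in this sketch), not from the whole observation dictionary.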

