
What is the difference between model and policy w.r.t. reinforcement learning?

Original post: 2019-07-27 10:34:33 | Tags: model / reinforcement-learning / policy / mdp

The definitions of both seem to say that they are a mapping from state to action, so what is the difference? Or am I wrong?

1 Answer

This article sums it up well:
What is Model-Based Reinforcement Learning?

To Model or Not to Model

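"Model" is one of those terms that gets thrown around a lot in machine learning (and in scientific disciplines more broadly), often with a relatively vague explanation of what we mean. Fortunately, in reinforcement learning, a model has a very specific meaning: it refers to the different dynamic states of an environment and how those states lead to a reward.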

...The policy is whatever strategy you use to determine what action/direction to take based on your current state/location.
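To make the distinction concrete, here is a minimal Python sketch. It assumes a hypothetical two-state toy environment; the state names, actions, and rewards are invented for illustration and do not come from the article. The point is only the shape of the two mappings: a policy takes a state and returns an action, while a model takes a (state, action) pair and returns the next state and the reward.

```python
from typing import Dict, Tuple

State = str
Action = str

# Policy: state -> action (what the agent chooses to do).
policy: Dict[State, Action] = {
    "cold": "heat",
    "warm": "wait",
}

# Model: (state, action) -> (next state, reward) (how the environment responds).
# Deterministic here for simplicity; real models are often probabilistic.
model: Dict[Tuple[State, Action], Tuple[State, float]] = {
    ("cold", "heat"): ("warm", 1.0),
    ("cold", "wait"): ("cold", -1.0),
    ("warm", "heat"): ("warm", -0.5),
    ("warm", "wait"): ("cold", 0.0),
}


def step(state: State) -> Tuple[Action, State, float]:
    """Pick an action with the policy, then predict its outcome with the model."""
    action = policy[state]
    next_state, reward = model[(state, action)]
    return action, next_state, reward


print(step("cold"))  # ('heat', 'warm', 1.0)
```

So the two are not the same mapping: the policy is keyed by state alone, while the model is keyed by state and action together.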

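The overall outcome of reinforcement learning (or any learning, really) is to develop a policy: a set of behaviours or actions prescribed for a specific domain.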

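The reinforcement part is that you can keep re-running the learning process based on the results of earlier learning, effectively applying the new policy and learning from the outcome in order to improve the policy.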
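The loop described here (apply the current policy, observe the outcome, improve the policy, repeat) can be sketched with tabular Q-learning. That algorithm is one common choice and is an assumption on my part; the answer does not name a specific method. The four-state chain environment, the hyperparameters, and all function names below are hypothetical.

```python
import random

N_STATES = 4  # states 0..3; state 3 is terminal (hypothetical toy chain)
ACTIONS = ("left", "right")


def env_step(state, action):
    """Toy environment: move along the chain; reward 1.0 on reaching the end."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def greedy(q, state, rng):
    """Current policy: best-known action, with ties broken at random."""
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            # Apply the current policy, with occasional random exploration.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q, state, rng)
            nxt, reward = env_step(state, action)
            # Learn from the outcome: move the estimate toward the observed target.
            target = reward + gamma * max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


q = train()
learned_policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
print(learned_policy)  # {0: 'right', 1: 'right', 2: 'right'}
```

Note that this sketch is model-free: the agent never builds a description of `env_step`, it only improves its action choices from experienced outcomes.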

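In model-based reinforcement learning we use a model to represent the environment or domain; the model documents the facts or states and the actions that can be taken. Knowing those facts, the policy can target specific states and actions on each iteration, testing and improving its own accuracy, and the same iterations can be used to improve the quality of the model.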
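As a rough illustration of using a model to produce a policy, the sketch below runs value iteration over a small known model. Value iteration is a standard planning method that I am substituting for illustration; the answer does not prescribe it. The `MODEL` table, state names, and discount factor are hypothetical.

```python
GAMMA = 0.9
STATES = ("cold", "warm")
ACTIONS = ("heat", "wait")

# Hypothetical model of the environment: (state, action) -> (next_state, reward).
MODEL = {
    ("cold", "heat"): ("warm", 1.0),
    ("cold", "wait"): ("cold", -1.0),
    ("warm", "heat"): ("warm", -0.5),
    ("warm", "wait"): ("cold", 0.0),
}


def q_value(values, state, action):
    """One-step lookahead: ask the model what happens, then add discounted value."""
    next_state, reward = MODEL[(state, action)]
    return reward + GAMMA * values[next_state]


def plan(sweeps=100):
    """Value iteration: repeatedly back up values through the model."""
    values = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        values = {s: max(q_value(values, s, a) for a in ACTIONS) for s in STATES}
    # The policy is read off the model and the values: best action in each state.
    policy = {s: max(ACTIONS, key=lambda a: q_value(values, s, a)) for s in STATES}
    return policy, values


policy, values = plan()
print(policy)  # {'cold': 'heat', 'warm': 'wait'}
```

Here the agent never acts in the environment at all: the policy is computed purely by querying the model, which is what separates planning from the trial-and-error loop.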

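Another way to look at the two is that the model is the record or result of previous learning: an updated view of the environment. The model deals in facts, or assumed facts, based on the results of past policy executions. It keeps a record of those executions, and that data can be used to estimate the outcome of taking a given action from a given state. The policy is the actual learning about behaviour, while the model holds the facts that support and confirm what we have learned.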
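The idea of a model as a record of past executions can be sketched as a simple count-based estimator. This is a minimal sketch under my own assumptions: the `TabularModel` class, its method names, and the logged transitions are hypothetical and not taken from the article.

```python
from collections import Counter, defaultdict


class TabularModel:
    """Estimates outcomes of (state, action) pairs from recorded experience."""

    def __init__(self):
        self.next_counts = defaultdict(Counter)  # (s, a) -> Counter of next states
        self.reward_sum = defaultdict(float)     # (s, a) -> total reward observed
        self.visits = Counter()                  # (s, a) -> number of observations

    def record(self, state, action, reward, next_state):
        """Store one observed transition from a past policy execution."""
        key = (state, action)
        self.next_counts[key][next_state] += 1
        self.reward_sum[key] += reward
        self.visits[key] += 1

    def predict(self, state, action):
        """Return (next-state probabilities, mean reward), or None if never seen."""
        key = (state, action)
        n = self.visits[key]
        if n == 0:
            return None
        probs = {s: c / n for s, c in self.next_counts[key].items()}
        return probs, self.reward_sum[key] / n


# Hypothetical log of (state, action, reward, next_state) tuples.
log = [
    ("cold", "heat", 1.0, "warm"),
    ("cold", "heat", 1.0, "warm"),
    ("cold", "heat", 0.0, "cold"),
    ("warm", "wait", 0.0, "cold"),
]

learned = TabularModel()
for transition in log:
    learned.record(*transition)

print(learned.predict("cold", "heat"))
```

With this log, heating from "cold" is estimated to lead to "warm" two times out of three, with a mean reward of 2/3; a pair that was never tried, such as ("warm", "heat"), yields `None`.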

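This diagram from the same article simplifies the relationship between the model and the policy in reinforcement learning:

[Diagram: relationship between model and policy; image not preserved]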

