
SARSA in Reinforcement Learning

I am coming across the SARSA algorithm in model-free reinforcement learning. Specifically, in each state you take an action a and then observe a new state s' .

My question is: if you don't have the state-transition probability P(next state | current state = s0), how do you know what your next state will be?

My attempt: do you simply try that action a out, and then observe the outcome from the environment?

Usually, yes: you execute the action in the environment, and the environment tells you what the next state is.
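As a minimal sketch of that interaction, assuming a gymnasium-style environment (FrozenLake-v1 and the specific API calls are my assumptions, not part of the original answer): the agent never queries P(s' | s, a); it just executes an action and observes whatever state and reward come back.

```python
# Minimal sketch: act in the environment and observe the next state.
# Assumes a gymnasium-style interface; FrozenLake-v1 is only an example task.
import gymnasium as gym

env = gym.make("FrozenLake-v1")
state, _ = env.reset()

action = env.action_space.sample()                                # pick some action a
next_state, reward, terminated, truncated, _ = env.step(action)   # environment reveals s' and r
print(state, action, reward, next_state)
```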

Yes. Based on the agent's experience, stored in an action-value function, its behavior policy pi maps the current state s to an action a , which leads it to a next state s' and then to a next action a' .

Flowchart of the state-action pair sequence.
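As a hedged sketch of that mapping, the experience "stored in an action-value function" can be pictured as a Q-table, with an epsilon-greedy rule choosing the action a for the current state s; the table shape, the epsilon value, and the helper name below are illustrative assumptions, not part of the original answer.

```python
# Sketch of a behavior policy pi derived from an action-value table Q.
import numpy as np

rng = np.random.default_rng(0)

def epsilon_greedy(Q, s, n_actions, epsilon=0.1):
    """Pick a random action with probability epsilon, otherwise the greedy one from Q[s]."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))      # explore
    return int(np.argmax(Q[s]))                  # exploit current estimates

# Usage (hypothetical sizes): Q = np.zeros((n_states, n_actions)); a = epsilon_greedy(Q, s, n_actions)
```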

A technique called TD learning (temporal-difference learning) is used in both Q-learning and SARSA precisely so that the transition probabilities never have to be learned explicitly.

In short, when you are sampling, i.e. interacting with the system and collecting data samples (state, action, reward, next state, next action), the transition probabilities in SARSA are taken into account implicitly whenever you use a sample to update the parameters of your model. For example, every time you choose an action in the current state and then receive a reward and a new state, the system has in fact generated that reward and new state according to the transition probability p(s', r | s, a).
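To make that concrete, here is a small SARSA sketch; the environment (a gymnasium-style FrozenLake-v1), the hyperparameters (alpha, gamma, epsilon, episode count), and the loop structure are assumptions for illustration. Note how the update uses only the sampled tuple (s, a, r, s', a'), while p(s', r | s, a) stays hidden inside the environment's step.

```python
# SARSA sketch: learn Q(s, a) from sampled transitions, never from p(s', r | s, a) directly.
import gymnasium as gym
import numpy as np

env = gym.make("FrozenLake-v1")
n_states, n_actions = env.observation_space.n, env.action_space.n
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.99, 0.1           # illustrative hyperparameters
rng = np.random.default_rng(0)

def policy(s):
    """Epsilon-greedy behavior policy derived from the current Q-table."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[s]))

for episode in range(5000):
    s, _ = env.reset()
    a = policy(s)
    done = False
    while not done:
        # The environment draws (r, s') from p(s', r | s, a) for us.
        s_next, r, terminated, truncated, _ = env.step(a)
        done = terminated or truncated
        a_next = policy(s_next)
        # TD update: move Q(s, a) toward the sampled target r + gamma * Q(s', a').
        target = r if terminated else r + gamma * Q[s_next, a_next]
        Q[s, a] += alpha * (target - Q[s, a])
        s, a = s_next, a_next
```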

You can find a simple description in this book:

Artificial Intelligence: A Modern Approach
