
What's the point of using Temporal difference learning at all?

As far as I know, for a specific policy $\pi$, temporal difference learning lets us compute the expected value of following that policy $\pi$. But what is the use of knowing the value function of one specific policy?

Shouldn't we be trying to find the optimal policy for a given environment? What is the point of evaluating one specific $\pi$ with temporal difference learning at all?
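To make the question concrete, here is a minimal TD(0) policy-evaluation sketch in Python. The toy chain environment, the 80%-right policy, and all parameter values are made up purely for illustration:

```python
import random

# A tiny, made-up chain MDP used only for illustration:
# states 0..4, moving right or left, reward 1 on reaching state 4.
N_STATES = 5
TERMINAL = 4

def step(state, action):
    """Hypothetical environment: action +1 moves right, -1 moves left."""
    next_state = min(max(state + action, 0), TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward, next_state == TERMINAL

def policy(state):
    """The fixed policy pi being evaluated: move right 80% of the time."""
    return 1 if random.random() < 0.8 else -1

def td0_evaluate(episodes=1000, alpha=0.1, gamma=0.9):
    """TD(0): after each step, nudge V(s) toward r + gamma * V(s')."""
    V = [0.0] * N_STATES
    for _ in range(episodes):
        s = 0
        done = False
        while not done:
            a = policy(s)
            s_next, r, done = step(s, a)
            # TD(0) update: estimate the value of the fixed policy pi
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V

print(td0_evaluate())
```

This gives me $V^\pi$ for that one policy, which is where my question starts: what do I gain from it?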

As you said, finding the value function of a given policy is, by itself, not very useful in the general case, where the goal is to find an optimal policy. However, several classical algorithms, such as SARSA and Q-learning, can be viewed as special cases of generalized policy iteration, in which the most difficult part is finding the value function of a policy. Once you know the value function, it is easy to derive a better policy; you then find the value function of that newly computed policy, and so on. Under certain conditions, this process converges to the optimal policy.
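To illustrate, here is a minimal sketch of SARSA on the same kind of toy chain environment as in the question; the environment, action set, and hyperparameters are invented for the example. The $\epsilon$-greedy action selection plays the role of policy improvement, while the TD update plays the role of policy evaluation, which is exactly the generalized policy iteration pattern:

```python
import random

# Same toy chain MDP as in the question, purely for illustration.
N_STATES = 5
TERMINAL = 4
ACTIONS = [-1, 1]

def step(state, action):
    next_state = min(max(state + action, 0), TERMINAL)
    reward = 1.0 if next_state == TERMINAL else 0.0
    return next_state, reward, next_state == TERMINAL

def epsilon_greedy(Q, s, eps):
    """Policy improvement: usually pick the action with the best Q(s, a)."""
    if random.random() < eps:
        return random.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: Q[s][a])

def sarsa(episodes=2000, alpha=0.1, gamma=0.9, eps=0.1):
    Q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        a = epsilon_greedy(Q, s, eps)
        done = False
        while not done:
            s_next, r, done = step(s, ACTIONS[a])
            a_next = epsilon_greedy(Q, s_next, eps)
            # Policy evaluation: TD update toward r + gamma * Q(s', a')
            Q[s][a] += alpha * (r + gamma * Q[s_next][a_next] - Q[s][a])
            s, a = s_next, a_next
    return Q

print(sarsa())
```

Each TD update evaluates the current $\epsilon$-greedy policy, and acting greedily with respect to the updated Q immediately improves it, so evaluation and improvement are interleaved at every step rather than run to completion in turn.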

In summary, temporal difference learning is a key step in other algorithms that allow us to find an optimal policy.
