When to use Monte Carlo over TD learning, and vice-versa

Original 2019-04-28 16:27:19 4 1 machine-learning/ reinforcement-learning/ montecarlo/ temporal-difference

When studying reinforcement learning, and model-free RL in particular, there are two methods we generally use:

TD learning
Monte Carlo

When is each one of them used over the other? In other words, how do we figure out which method is best for our problem?

1 answer

Sections 6.1 and 6.2 of Sutton & Barto give a very nice intuitive understanding of the difference between Monte Carlo and TD learning.

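Having said that, there is of course the obvious incompatibility of MC methods with non-episodic (continuing) tasks: an MC update needs the full return, which is only known once the episode terminates. In that case, you will always need some kind of bootstrapping, as in TD learning.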
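To make the difference concrete, here is a minimal sketch of both update rules. It assumes a tabular value function and a small five-state random walk, similar to the one used in Sutton & Barto's Chapter 6. The function names (`run_episode`, `mc_update`, `td0_update`, `evaluate`) and the step size are my own illustrative choices, not something taken from the book.

```python
import random


def run_episode(n_states=5, rng=random):
    """Random walk on states 0..n_states-1, starting in the middle.

    Stepping off the right end gives reward 1, off the left end reward 0.
    Returns a list of (state, reward, next_state); next_state is None
    when the transition is terminal.
    """
    s = n_states // 2
    transitions = []
    while True:
        s2 = s + rng.choice((-1, 1))
        if s2 < 0:
            transitions.append((s, 0.0, None))
            return transitions
        if s2 >= n_states:
            transitions.append((s, 1.0, None))
            return transitions
        transitions.append((s, 0.0, s2))
        s = s2


def mc_update(V, episode, alpha=0.01, gamma=1.0):
    """Every-visit constant-alpha MC.

    The target is the full return G, so this can only run once the
    episode has terminated.
    """
    G = 0.0
    for s, r, _ in reversed(episode):
        G = r + gamma * G
        V[s] += alpha * (G - V[s])


def td0_update(V, s, r, s2, alpha=0.01, gamma=1.0):
    """TD(0): the target bootstraps from the current estimate V[s2],
    so a single transition is enough to make an update."""
    bootstrap = gamma * V[s2] if s2 is not None else 0.0
    V[s] += alpha * (r + bootstrap - V[s])


def evaluate(n_episodes=5000, seed=0):
    """Estimate state values of the random policy with both methods."""
    rng = random.Random(seed)
    V_mc = [0.5] * 5
    V_td = [0.5] * 5
    for _ in range(n_episodes):
        episode = run_episode(rng=rng)
        mc_update(V_mc, episode)      # needs the whole episode
        for s, r, s2 in episode:      # could equally be done online, step by step
            td0_update(V_td, s, r, s2)
    return V_mc, V_td


if __name__ == "__main__":
    V_mc, V_td = evaluate()
    print("MC :", [round(v, 2) for v in V_mc])
    print("TD0:", [round(v, 2) for v in V_td])
```

Note that `td0_update` only ever needs one `(state, reward, next_state)` transition, which is why it still works when the task never terminates, whereas `mc_update` has nothing to work with until the episode is over.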

Why do Markov chain monte carlo (MCMC) useful in bayesian machine learning?

Simple example of reinforce algorithm (monte-carlo policy gradient)

Monte Carlo Tree Search Tic-Tac-Toe — Poor Agent

Monte Carlo Tree Search in board games - How to Implement Opponent Moves

Monte Carlo dropout doesn't change anything with any rate

Markov Chain Monte Carlo, proposal distribution for multivariate Bernoulli distribution?

How to compute the uncertainty of a Monte Carlo Dropout neural network with PyTorch?

TD learning vs Q learning

When to use learning rate finder

When to use supervised or unsupervised learning?


The technical posts on this site follow the CC BY-SA 4.0 license. If you need to reprint, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



solution1 2 2019-05-02 02:00:03
