When to use Monte Carlo over TD learning, and vice-versa

When studying reinforcement learning, and specifically model-free RL, there are two methods we generally use:

  • TD learning
  • Monte Carlo

When should each one be used over the other? In other words, how do we figure out which method is best for our problem?

Sections 6.1 and 6.2 of Sutton & Barto give a very nice intuitive understanding of the difference between Monte Carlo and TD learning.
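For reference, the difference comes down to the target each method moves its value estimate toward. The two tabular prediction updates can be written as follows, where the symbols follow the usual textbook notation: G_t is the full return observed after time t, alpha is a step size, and gamma is the discount factor.

```latex
\begin{align*}
\text{constant-}\alpha\text{ MC:} \quad & V(S_t) \leftarrow V(S_t) + \alpha \big[ G_t - V(S_t) \big] \\
\text{TD(0):} \quad & V(S_t) \leftarrow V(S_t) + \alpha \big[ R_{t+1} + \gamma V(S_{t+1}) - V(S_t) \big]
\end{align*}
```

The MC target G_t is only available once the episode has ended, while the TD(0) target is available after a single step because it substitutes the current estimate V(S_{t+1}) for the rest of the return.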

Having said that, there's of course the obvious incompatibility of MC methods with non-episodic (continuing) tasks: the MC target is the full return, which is only known once an episode terminates. In that case, you will always need some kind of bootstrapping.
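To make the episodic requirement concrete, here is a minimal sketch in Python, assuming a tabular value function stored in a plain dict. The function names `td0_update` and `mc_update`, the three-state episode, and the step size of 0.5 are all hypothetical choices for illustration and do not come from the original post. Note that `mc_update` cannot run until the whole episode is available, whereas `td0_update` can be called after every single transition.

```python
from typing import Dict, List, Optional, Tuple


def td0_update(
    values: Dict[str, float],
    state: str,
    reward: float,
    next_state: Optional[str],
    alpha: float = 0.1,
    gamma: float = 1.0,
) -> None:
    """Apply one TD(0) update from a single transition.

    next_state=None marks a terminal transition (assumption of this sketch),
    in which case the bootstrapped value is taken to be 0.
    """
    bootstrap = 0.0 if next_state is None else values.get(next_state, 0.0)
    target = reward + gamma * bootstrap  # one-step bootstrapped target
    old = values.get(state, 0.0)
    values[state] = old + alpha * (target - old)


def mc_update(
    values: Dict[str, float],
    episode: List[Tuple[str, float]],
    alpha: float = 0.1,
    gamma: float = 1.0,
) -> None:
    """Apply every-visit constant-alpha Monte Carlo updates.

    `episode` is a complete list of (state, reward received on leaving it).
    The return G can only be computed because the episode has terminated.
    """
    g = 0.0
    for state, reward in reversed(episode):
        g = reward + gamma * g  # accumulate the return backwards in time
        old = values.get(state, 0.0)
        values[state] = old + alpha * (g - old)


# Hypothetical episode: A -> B -> C -> terminal, reward 1 only at the end.
episode = [("A", 0.0), ("B", 0.0), ("C", 1.0)]

# Monte Carlo: one batch of updates after the episode is finished.
mc_values: Dict[str, float] = {}
mc_update(mc_values, episode, alpha=0.5)

# TD(0): one update per transition, usable while the episode is still running.
td_values: Dict[str, float] = {}
td0_update(td_values, "A", 0.0, "B", alpha=0.5)
td0_update(td_values, "B", 0.0, "C", alpha=0.5)
td0_update(td_values, "C", 1.0, None, alpha=0.5)

print("MC:", mc_values)
print("TD:", td_values)
```

After this single episode, the MC estimates for all three states have moved toward the observed return, while TD(0) has only updated the state next to the reward; the information reaches earlier states over later episodes. In a task that never terminates, the loop in `mc_update` would have no complete episode to work from, which is why bootstrapping is needed there.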
