
Agent repeats the same action cycle non-stop (Q-learning)

How can you prevent an agent from repeating the same cycle of actions non-stop?

Presumably by changing the reward system somehow. But are there general rules you can follow, or things you can build into your code, to prevent this problem?


To be more precise, my actual problem is this one:

I'm trying to teach an ANN to play Doodle Jump using Q-learning. After only a few generations, the agent keeps jumping on one and the same platform/stone over and over again, non-stop. Increasing the length of the random-exploration phase doesn't help.

My reward system is the following:

  • +1 when the agent is living
  • +2 when the agent jumps on a platform
  • -1000 when it dies
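
For reference, the reward scheme listed above could be written as a single function. This is only a sketch: the names are hypothetical, and it assumes that the +1 and +2 add up on a step where the agent lands on a platform and that a death yields only the -1000, neither of which is stated in the question.

```python
# Reward values taken from the list above; the constant names are hypothetical.
REWARD_ALIVE = 1
REWARD_PLATFORM = 2
REWARD_DEATH = -1000


def step_reward(died: bool, landed_on_platform: bool) -> int:
    """Return the reward for one time step under the scheme above.

    Assumption: the living and platform rewards stack, and death
    replaces them with the death penalty.
    """
    if died:
        return REWARD_DEATH
    reward = REWARD_ALIVE
    if landed_on_platform:
        reward += REWARD_PLATFORM
    return reward
```

Under this scheme, bouncing on a single safe platform keeps collecting positive reward and never risks the -1000, which may be part of why the agent settles into that loop.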

One idea would be to give the agent a negative reward, or at least 0, when it lands on the same platform as before. But to do so, I'd have to pass several new input parameters to the ANN: the x,y coordinates of the agent and the x,y coordinates of the last visited platform.

Furthermore, the ANN would then also have to learn that a platform is 4 blocks thick, and so on.

Therefore, I'm sure that the idea I just mentioned wouldn't solve the problem. On the contrary, I believe the ANN would simply stop learning well in general, because there would be too many unhelpful and hard-to-interpret inputs.
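
The same-platform penalty described above could be sketched on the game side roughly as follows. The class name, the integer platform ids and the default reward values are assumptions made for illustration; the sketch only covers how the reward would be computed, not whether the network needs the extra inputs to learn from it.

```python
from typing import Optional


class PlatformRewardTracker:
    """Hypothetical helper that remembers the last platform landed on."""

    def __init__(self, new_platform_reward: int = 2, repeat_reward: int = 0):
        self.new_platform_reward = new_platform_reward
        self.repeat_reward = repeat_reward
        self.last_platform_id: Optional[int] = None

    def reset(self) -> None:
        """Forget the last platform, e.g. at the start of an episode."""
        self.last_platform_id = None

    def landing_reward(self, platform_id: int) -> int:
        """Reward for landing on `platform_id` (assumed unique per platform)."""
        if platform_id == self.last_platform_id:
            # Same platform as the previous landing: reduced reward.
            return self.repeat_reward
        self.last_platform_id = platform_id
        return self.new_platform_reward
```

Note that this only compares against the most recent landing, so alternating between two platforms would still earn the full reward each time.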

This is not a direct answer to the question as asked in general terms.


I found a workaround for my particular Doodle Jump example; perhaps someone is doing something similar and will find it helpful:

  • While training: let every platform the agent has jumped on disappear afterwards, and spawn a new one somewhere else.

  • While testing/presenting: you can disable the new disappearing-platform feature (so that the game behaves as it did before), and the player will play well and won't hop on one and the same platform all the time.
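
The two modes above could be sketched like this. It is not the actual game code: the `Platforms` class, its fields and the uniform random respawn position are all assumptions made for illustration.

```python
import random


class Platforms:
    """Hypothetical platform container with a training-only respawn rule."""

    def __init__(self, positions, width, height, training, rng=None):
        self.positions = list(positions)  # list of (x, y) tuples
        self.width = width
        self.height = height
        self.training = training
        self.rng = rng if rng is not None else random.Random()

    def on_landing(self, index):
        """Call this whenever the agent lands on platform `index`."""
        if not self.training:
            # Testing/presenting: platforms stay where they are.
            return
        # Training: the used platform disappears and a new one is
        # spawned at a random position (assumed uniform over the screen).
        self.positions[index] = (
            self.rng.uniform(0, self.width),
            self.rng.uniform(0, self.height),
        )
```

Constructing the container with `training=True` while learning and `training=False` for the demo switches between the two behaviours without touching the rest of the game loop.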

