Agent repeats the same action cycle non-stop, Q-learning

Original post: 2020-04-22 14:02:22. Tags: python, tensorflow, reinforcement-learning, q-learning

How can you prevent the agent from repeating the same cycle of actions non-stop?

Presumably by changing the reward system somehow. But are there general rules you could follow, or things to build into your code, to prevent this kind of problem?

To be more precise, my actual problem is this one:

I'm trying to teach an ANN to play Doodle Jump using Q-learning. After only a few generations, the agent keeps jumping on one and the same platform/stone over and over again, non-stop. Increasing the length of the random exploration phase doesn't help.
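For readers unfamiliar with the "random exploration" mentioned above, here is a minimal sketch of epsilon-greedy action selection with a linearly decaying epsilon. It is an illustration only, not the code from this question: the function names, the decay schedule, and all default values are assumptions.

```python
import random

def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps.

    All defaults are illustrative assumptions, not values from the post.
    """
    fraction = min(step / decay_steps, 1.0)
    return eps_start + fraction * (eps_end - eps_start)

def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Lengthening `decay_steps` corresponds to a longer exploration phase, which, as described above, did not fix the looping behaviour.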

My reward system is the following:

+1 while the agent is alive
+2 when the agent jumps on a platform
-1000 when it dies
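
Written as code, the reward scheme above might look roughly like this. The function name and arguments are hypothetical, and the sketch assumes the two positive rewards add up when both apply in the same step, which the post does not state.

```python
def step_reward(died, landed_on_platform):
    """Per-step reward following the scheme listed above.

    Assumption: the +1 "alive" reward and the +2 "platform" reward stack.
    """
    if died:
        return -1000
    reward = 1  # agent is still alive
    if landed_on_platform:
        reward += 2  # agent jumped on a platform this step
    return reward
```

Note that under this scheme, bouncing on a single safe platform forever collects positive reward on every step and never risks the -1000 penalty.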

One idea would be to give a negative reward, or at least 0, when the agent hits the same platform as it did before. But to do so, I'd have to pass a lot of new input parameters to the ANN: the x,y coordinates of the agent and the x,y coordinates of the last visited platform.

Furthermore, the ANN would then also have to learn that a platform is 4 blocks thick, and so on.

Therefore, I'm sure that the idea I just mentioned wouldn't solve the problem. On the contrary, I believe the ANN would simply stop learning well in general, because there would be too many unhelpful and hard-to-interpret inputs.
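
For illustration, the reward part of that idea could be sketched with bookkeeping on the game side, as below. `PlatformRewardTracker` and `platform_id` are hypothetical names, and the sketch assumes the game can tell platforms apart. It only shows how such a reward could be computed; it does not settle whether the ANN would need the extra inputs to learn from it.

```python
class PlatformRewardTracker:
    """Remembers the last platform hit and rewards only landings on a different one."""

    def __init__(self, new_platform_reward=2, repeat_reward=0):
        self.new_platform_reward = new_platform_reward
        self.repeat_reward = repeat_reward
        self.last_platform_id = None

    def reset(self):
        """Call at the start of every episode."""
        self.last_platform_id = None

    def on_landing(self, platform_id):
        """Return the reward for landing on the platform with this id."""
        if platform_id == self.last_platform_id:
            return self.repeat_reward
        self.last_platform_id = platform_id
        return self.new_platform_reward
```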

1 answer

This is not a direct answer to the question, which was asked in very general terms.

I found a workaround for my particular Doodle Jump example; perhaps someone is doing something similar and needs help:

While training: make every platform the agent has jumped on disappear afterwards, and spawn a new one somewhere else.
While testing/presenting: you can disable the new "disappear" feature (so that the game behaves as it did before), and the player will play well and won't hop on one and the same platform all the time.
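
A minimal sketch of how this workaround could be wired up, assuming platforms are stored as (x, y) grid positions. The class `PlatformField`, its parameters, and the `training` flag are illustrative names, not the code actually used here.

```python
import random

class PlatformField:
    """Holds platform positions; respawns a platform after a landing while training."""

    def __init__(self, width, height, n_platforms, training=True, rng=None):
        # Assumes a grid with more than one cell, so "somewhere else" exists.
        self.width = width
        self.height = height
        self.training = training
        self.rng = rng if rng is not None else random.Random()
        self.platforms = [self._random_position() for _ in range(n_platforms)]

    def _random_position(self):
        return (self.rng.randrange(self.width), self.rng.randrange(self.height))

    def on_landing(self, index):
        """Call when the agent lands on platform `index`."""
        if not self.training:
            return  # testing/presenting: platforms stay where they are
        old = self.platforms[index]
        new = old
        while new == old:  # make sure the platform really moves
            new = self._random_position()
        self.platforms[index] = new
```

Creating the field with `training=False` restores the original game behaviour for testing or presenting.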

Related questions

q agent is learning not to take any actions

Python pyglet repeats playing the audio non stop

Enhancement of Agent Training Q Learning Taxi V3

Python command repeats same number in command(im learning python)

hypothesis repeats the same values

Trying to stop repeats in matchup generator

Q-learning model not improving

Deep Q Learning For Snake Game

Deep Q-learning modification

while loop repeats once when it should stop

The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please cite the site URL or the original address. For any questions, please contact: yoyou2525@163.com.

solution1 0 ACCEPTED 2020-04-25 15:27:56