FrozenLake Q-Learning Update Issue
I'm learning Q-learning and trying to build a Q-learner for the FrozenLake-v0 problem in OpenAI Gym. Since the problem has only 16 states and 4 possible actions, it should be fairly easy, but it looks like my algorithm is not updating the Q-table correctly.
The following is my Q-learning algorithm:
import random as rand

import gym
import numpy as np
from gym import wrappers


def run(
    env,
    Qtable,
    N_STEPS=10000,
    alpha=0.2,  # learning rate (weight on the new estimate)
    rar=0.4,  # random exploration rate
    radr=0.97  # decay rate
):
    # Initialize pars:
    TOTAL_REWARD = 0
    done = False
    action = env.action_space.sample()
    state = env.reset()
    for _ in range(N_STEPS):
        if done:
            print('TW', TOTAL_REWARD)
            break
        s_prime, reward, done, info = env.step(action)
        # Update Q Table:
        Qtable[state, action] = (1 - alpha) * Qtable[state, action] + alpha * (reward + Qtable[s_prime, np.argmax(Qtable[s_prime, :])])
        # Prepare for the next step:
        # Next New Action:
        if rand.uniform(0, 1) < rar:
            action = env.action_space.sample()
        else:
            action = np.argmax(Qtable[s_prime, :])
        # Update new state:
        state = s_prime
        # Update Decay:
        rar *= radr
        # Update Stats
        TOTAL_REWARD += reward
        if reward > 0:
            print(reward)
    return Qtable, TOTAL_REWARD
Then I run the Q-learner for 1000 iterations:
if __name__ == "__main__":
    # Required Pars:
    N_ITER = 1000
    REWARDS = []
    # Setup the Maze:
    env = gym.make('FrozenLake-v0')
    # Initialize Qtable:
    num_actions = env.unwrapped.nA
    num_states = env.unwrapped.nS
    # Qtable = np.random.uniform(0, 1, size=num_states * num_actions).reshape((num_states, num_actions))
    Qtable = np.zeros((env.observation_space.n, env.action_space.n))
    for _ in range(N_ITER):
        res = run(env, Qtable)
        Qtable = res[0]
        REWARDS.append(res[1])
    print(np.mean(REWARDS))
Any advice will be appreciated!
Here you can find a working example for this problem, solved in two ways (with a Q-table and with a neural network):
https://medium.com/emergent-future/simple-reinforcement-learning-with-tensorflow-part-0-q-learning-with-tables-and-neural-networks-d195264329d0
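For comparison with the code in the question, here is a minimal sketch of a tabular Q-learning loop. It is not taken from the linked article. Three things differ from the question's code and may or may not explain the behaviour seen there: the target is discounted by a factor `gamma` (the question's update has none), greedy ties are broken at random (`np.argmax` on an all-zero row always returns 0), and the exploration rate decays once per episode and is carried across episodes (in the question, `rar` resets to 0.4 on every call to `run`). The names `q_update`, `greedy_action`, `train`, and `corridor_step`, the value `gamma=0.95`, and the 5-state corridor are all assumptions made for illustration. The sketch uses plain Python lists instead of Gym and NumPy so that it runs on its own.

```python
import random


def greedy_action(q_row, rng):
    """Return an index of the largest value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])


def q_update(q_table, state, action, reward, next_state, done,
             alpha=0.2, gamma=0.95):
    """One tabular Q-learning step: Q <- (1 - alpha) * Q + alpha * target."""
    # A terminal transition has no future value, so the target is the reward.
    future = 0.0 if done else max(q_table[next_state])
    target = reward + gamma * future
    q_table[state][action] = (1 - alpha) * q_table[state][action] + alpha * target
    return q_table[state][action]


def train(step_fn, n_states, n_actions, episodes=500, max_steps=100,
          epsilon=0.4, epsilon_decay=0.99, seed=0):
    """Epsilon-greedy Q-learning.

    step_fn(state, action) must return (next_state, reward, done).
    """
    rng = random.Random(seed)
    q_table = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0  # assumption: every episode starts in state 0
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                action = greedy_action(q_table[state], rng)
            next_state, reward, done = step_fn(state, action)
            q_update(q_table, state, action, reward, next_state, done)
            state = next_state
            if done:
                break
        # Decay once per episode; the value persists into the next episode.
        epsilon *= epsilon_decay
    return q_table


def corridor_step(state, action):
    """Hypothetical 5-state corridor: action 1 moves right, action 0 moves left.

    Reaching state 4 ends the episode with reward 1.
    """
    next_state = min(state + 1, 4) if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done


q = train(corridor_step, n_states=5, n_actions=2)
policy = [greedy_action(row, random.Random(0)) for row in q[:4]]
print(policy)
```

To try the same idea on FrozenLake, `step_fn` could wrap `env.step` and the start state could come from `env.reset()`. Because that environment is stochastic, the hyperparameters above would likely need tuning.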