I wrote a machine learning algorithm in Python using TensorFlow. The algorithm's pseudo-code is shown in the figure below. In this algorithm I call sess.run() more than once inside the training loop, because I have to evaluate the same neural network at different inputs to calculate δ. For a reason I have not yet found, my code is extremely slow (see my related questions on the codereview and ai sites for the code).
Figure taken from the book Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto.
My questions for this stack are the following:
Is it slower to do,
sess.run([op1],feed_dict={input:data})
sess.run([op2],feed_dict={input:data})
instead of,
sess.run([op1,op2],feed_dict={input:data})
Is there any difference at all?
I'm currently calculating δ as follows:
self.delta = self.time_step_info['r'] + (not self.time_step_info['d'])*self.gamma*sess.run(self.critic(),feed_dict={self.state_in:self.time_step_info['s1']}) - sess.run(self.critic(),feed_dict={self.state_in:self.time_step_info['s']})
For your first question, I'm not sure.
But for your second question: as you may already know, the input to the network is a matrix X, and that matrix can contain several inputs at once, one per row. The NN then generates a corresponding result matrix Y, where each row of Y is the output for the matching row of X.
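To make the batching idea concrete, here is a minimal sketch of computing δ with a single network evaluation by stacking s and s1 as two rows of one input matrix. It uses NumPy with a made-up linear "critic" as a stand-in for your network, so `critic`, `weights`, `td_error` and the 4-feature state are assumptions for illustration only, not your actual code. In the TensorFlow version the same stacking would presumably let you feed both states through one sess.run call instead of two.

```python
import numpy as np

# Hypothetical stand-in for the critic network: one linear layer.
# In the real code this would be the critic op, evaluated once on the
# stacked batch rather than once per state.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 1))  # 4 state features -> 1 value


def critic(states):
    """Map a batch of states (one per row) to one value per row."""
    return states @ weights


def td_error(r, done, gamma, s, s1):
    # Stack s and s1 as two rows so the network is evaluated only once.
    batch = np.vstack([s, s1])        # shape (2, 4)
    v_s, v_s1 = critic(batch)[:, 0]   # row 0 -> V(s), row 1 -> V(s1)
    # Same formula as in the question: r + (not d) * gamma * V(s1) - V(s)
    return r + (not done) * gamma * v_s1 - v_s


s = np.array([0.1, 0.2, 0.3, 0.4])
s1 = np.array([0.5, 0.6, 0.7, 0.8])
delta = td_error(r=1.0, done=False, gamma=0.99, s=s, s1=s1)
```

The result should match what you get from evaluating the network separately on s and on s1, since each output row depends only on its own input row.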