
Issue implementing gradient descent

I'm implementing gradient descent from scratch and I've got a segment of code which is giving me trouble.

temp = theta_new[j]
theta_new[j] = theta_new[j] - alpha*deriv
theta_old[j] = temp

It's not changing theta_new[j]. If I print theta_new[j] right after the assignment, it has changed, but somehow the third line, where I assign to theta_old[j], reverts theta_new[j] back to its initial value. I assume this has something to do with how arrays are referenced, but I can't wrap my head around it.
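A minimal sketch of this symptom, assuming (since the setup code isn't shown) that theta_old was created as an alias of theta_new, e.g. via `theta_old = theta_new`:

```python
# Hypothetical setup: both names bound to the SAME list object (no copy).
theta_new = [1.0, 2.0]
theta_old = theta_new

alpha, deriv, j = 0.1, 4.0, 0   # illustrative values, not from the question

temp = theta_new[j]                         # save 1.0
theta_new[j] = theta_new[j] - alpha*deriv   # the shared list now holds 0.6
theta_old[j] = temp                         # same list! writes 1.0 back

print(theta_new[j])  # 1.0 — the update appears "reverted"
```

Because both names point at one list, the write through theta_old[j] is also a write through theta_new[j].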

This happens when theta_old and theta_new are two names for the same list object, so assigning to theta_old[j] also overwrites theta_new[j]. Make theta_old an independent copy (a new object in memory) before updating:

import copy

theta_old = copy.deepcopy(theta_new)  # independent copy, not a second name for the same list
theta_new[j] = theta_new[j] - alpha*deriv
...

https://docs.python.org/3/library/copy.html
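A quick sketch of the fixed update loop under the same assumed setup as above (theta values, alpha, and deriv are illustrative). For a flat list of floats a shallow copy such as `list(theta_new)` is enough; `copy.deepcopy` also covers nested structures:

```python
import copy

theta_new = [1.0, 2.0]
alpha, deriv, j = 0.1, 4.0, 0   # illustrative values

theta_old = copy.deepcopy(theta_new)        # snapshot BEFORE the update
theta_new[j] = theta_new[j] - alpha*deriv   # only theta_new changes

print(theta_old[j], theta_new[j])  # 1.0 0.6 — old value preserved
```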

