
Issue implementing gradient descent

I'm implementing gradient descent from scratch and I've got a segment of code which is giving me trouble.

temp = theta_new[j]
theta_new[j] = theta_new[j] - alpha*deriv
theta_old[j] = temp

It's not changing theta_new[j]. If I print theta_new[j] right after the assignment, it has changed, but somehow the third line, where I assign to theta_old[j], reverts theta_new[j] back to its initial value. I assume this has something to do with how arrays are referenced, but I can't wrap my head around it.
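A minimal sketch of this symptom, assuming (since the setup code isn't shown) that theta_old was created as an alias of theta_new, e.g. via `theta_old = theta_new`:

```python
# Hypothetical setup: both names bound to the SAME list object (no copy).
theta_new = [1.0, 2.0]
theta_old = theta_new

alpha, deriv, j = 0.1, 4.0, 0   # illustrative values, not from the question

temp = theta_new[j]                         # save 1.0
theta_new[j] = theta_new[j] - alpha*deriv   # the shared list now holds 0.6
theta_old[j] = temp                         # same list! writes 1.0 back

print(theta_new[j])  # 1.0 — the update appears "reverted"
```

Because both names point at one list, the write through theta_old[j] is also a write through theta_new[j].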

This happens when theta_old and theta_new are two names for the same list object, so assigning to theta_old[j] also overwrites theta_new[j]. Make theta_old an independent copy (a new object in memory) before updating:

import copy

theta_old = copy.deepcopy(theta_new)  # independent copy, not a second name for the same list
theta_new[j] = theta_new[j] - alpha*deriv
...

https://docs.python.org/3/library/copy.html
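A quick sketch of the fixed update loop under the same assumed setup as above (theta values, alpha, and deriv are illustrative). For a flat list of floats a shallow copy such as `list(theta_new)` is enough; `copy.deepcopy` also covers nested structures:

```python
import copy

theta_new = [1.0, 2.0]
alpha, deriv, j = 0.1, 4.0, 0   # illustrative values

theta_old = copy.deepcopy(theta_new)        # snapshot BEFORE the update
theta_new[j] = theta_new[j] - alpha*deriv   # only theta_new changes

print(theta_old[j], theta_new[j])  # 1.0 0.6 — old value preserved
```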

