Gradient descent using tensors calculating wrong values

I am implementing a simple gradient descent algorithm using tensors. It learns two parameters, m and c.
The plain Python code for it is:

for i in range(epochs):
    Y_pred = m*X + c  # The current predicted value of Y
    D_m = (-2/n) * sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * sum(Y - Y_pred)  # Derivative wrt c
    m = m - L * D_m  # Update m (L is the learning rate)
    c = c - L * D_c  # Update c
    print(m, c)

Output for Python:

0.7424335285442664 0.014629895049575754
1.1126970531591416 0.021962519495058154
1.2973530613155333 0.025655870599552183
1.3894434413955663 0.027534253868790198
1.4353697670010162 0.028507481513901086
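
For reference, a minimal self-contained version of the loop above might look like this; the synthetic data and the hyperparameter values here are assumptions for illustration, not taken from the question (the actual data is in the linked repo):

import numpy as np

# Synthetic linear data (assumed for this sketch)
np.random.seed(0)
X = np.random.rand(100)
Y = 1.5 * X + 0.3 + 0.05 * np.random.randn(100)

m, c = 0.0, 0.0    # Parameters to learn
L = 0.01           # Learning rate (assumed value)
n = float(len(X))  # Number of samples
epochs = 5

for i in range(epochs):
    Y_pred = m * X + c                       # Current prediction
    D_m = (-2/n) * np.sum(X * (Y - Y_pred))  # Derivative wrt m
    D_c = (-2/n) * np.sum(Y - Y_pred)        # Derivative wrt c
    m = m - L * D_m                          # Both updates use the same Y_pred
    c = c - L * D_c
    print(m, c)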

The equivalent TensorFlow code is:

#Graph of gradient descent (m and c are tf.Variables)
y_pred = m*x + c
d_m = (-2/n) * tf.reduce_sum(x*(y-y_pred))  # Derivative wrt m
d_c = (-2/n) * tf.reduce_sum(y-y_pred)      # Derivative wrt c
upm = tf.assign(m, m - learning_rate * d_m)  # Op that updates m
upc = tf.assign(c, c - learning_rate * d_c)  # Op that updates c

#Starting session
sess = tf.Session()
sess.run(tf.global_variables_initializer())  # Initialize the variables m and c

#Training for epochs
for i in range(epochs):
    sess.run(y_pred)
    sess.run(d_m)
    sess.run(d_c)
    sess.run(upm)
    sess.run(upc)
    w = sess.run(m)
    b = sess.run(c)
    print(w, b)

Output for TensorFlow:

0.7424335285442664 0.007335550424492317
1.1127687194584988 0.011031122807663662
1.2974962163433057 0.012911024540805463
1.3896400798226038 0.013885244876397126
1.4356019721347115 0.014407698787092268

The parameter m ends up with the same values in both versions, but the parameter c does not, even though the implementation is the same.
The output above shows the first 5 values of m and c; the value of c computed with tensors is approximately half of the plain Python value.
I don't know where my mistake is.

To recreate the entire output, see the repo containing the data along with both implementations.

The repo also contains an image of the graph obtained through TensorBoard, in the events directory.

The problem is that, in the TF implementation, the updates are not being performed atomically. In other words, the implementation updates m and c in an interleaved manner (e.g. the new value of m is already in effect when c is updated). Because upm runs first, upc re-evaluates y_pred with the updated m; the residual y - y_pred is therefore smaller, so c takes a smaller step, which is why its values come out roughly halved. To make the updates atomic, you should simultaneously run upm and upc:

sess.run([upm, upc])
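
For completeness, the corrected training loop might look like the sketch below (variable names are taken from the question; the session setup is unchanged). The separate sess.run calls for y_pred, d_m, and d_c are unnecessary in any case, since upm and upc already evaluate everything they depend on:

# Both update ops run in a single step, so the gradients
# are computed from the same (old) values of m and c
for i in range(epochs):
    sess.run([upm, upc])
    w, b = sess.run([m, c])
    print(w, b)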
