What's wrong with this gradient descent?

I want to minimize ||Ah(Wx)-y|| by gradient descent, where h is ReLU.

import numpy as np

s = 9
n = 99
m = 999
A = np.random.normal((n,m))
y = np.random.normal((m,1))
W = np.random.normal((n,s))

def obj_fcn(x):
    return np.linalg.norm(A.dot(np.max(W.dot(x),0))-y)

if __name__ == '__main__':
    dx = 10
    max_iter = 99
    i = 0
    x = np.zeros((s,1))
    ddx = 0.00001
    ddx2 = ddx*2
    while i<max_iter:
        d_obj = (obj_fcn(x+ddx)-obj_fcn(x-ddx))/ddx2
        x = x + d_obj
        print(x)
        i = i+1

This gives the error:

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/Chu/Documents/fun/m608.py
Traceback (most recent call last):
  File "/Users/Chu/Documents/fun/m608.py", line 21, in <module>
    d_obj = (obj_fcn(x+ddx)-obj_fcn(x-ddx))/ddx2
  File "/Users/Chu/Documents/fun/m608.py", line 11, in obj_fcn
    return np.linalg.norm(A.dot(np.max(W.dot(x),0))-y)
ValueError: shapes (2,) and (9,1) not aligned: 2 (dim 0) != 9 (dim 0)

What's wrong? Is there a book I can read about such issues in Python? I almost always have this kind of trouble.

I broke your function down into individual operations and printed the inputs and results as soon as they were computed. This is a very basic debugging technique; you need to learn this skill if you're going to program at any level beyond trivial applications. See this lovely debug blog for help.

def obj_fcn(x):
    print("x", x)
    print("W", W)
    wdot = W.dot(x)
    print("wdot", wdot)
    wdot = np.max(wdot, 0)
    adot = A.dot(wdot)
    print("adot", adot)
    norm_arg = adot - y   
    print("arg", norm_arg)
    result = np.linalg.norm(norm_arg)
    return result
    # return np.linalg.norm(A.dot(np.max(W.dot(x),0))-y)

Output:

x [[  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]
 [  1.00000000e-05]]
W [ 98.90267125   7.01706833]
Traceback (most recent call last):
  File "so.py", line 32, in <module>
    d_obj = (obj_fcn(x+ddx) - obj_fcn(x-ddx)) / ddx2
  File "so.py", line 13, in obj_fcn
    wdot = W.dot(x)
ValueError: shapes (2,) and (9,1) not aligned: 2 (dim 0) != 9 (dim 0)

That shows the error clearly: W has shape (2,) while x has shape (9,1), so you're trying to take the dot product of arrays of mismatched lengths. The dot product is not defined for such a case. You have to fix your input processing.
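
For reference, here is a minimal sketch of one way the setup could be repaired. It is not the asker's final code. The printed W has only two entries because np.random.normal takes the mean (loc) as its first positional argument, so the shape has to be passed as size=. The shape (m, n) for A, the seed, the starting point, the step size lr, and the helper numerical_grad are all assumptions made for illustration.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the run is repeatable
s, n, m = 9, 99, 999

# A tuple passed positionally is taken as `loc` (the mean), not the shape,
# so this draws one sample per mean and yields a length-2 vector.
W_bad = np.random.normal((n, s))
print(W_bad.shape)  # (2,)

# Assumed shapes that make A.dot(h(W.dot(x))) - y well defined.
A = np.random.normal(size=(m, n))
y = np.random.normal(size=(m, 1))
W = np.random.normal(size=(n, s))

def obj_fcn(x):
    # np.maximum(v, 0) is the elementwise ReLU.
    return np.linalg.norm(A.dot(np.maximum(W.dot(x), 0)) - y)

def numerical_grad(f, x, eps=1e-5):
    # Central difference, perturbing one coordinate of x at a time.
    grad = np.zeros_like(x)
    for k in range(x.shape[0]):
        step = np.zeros_like(x)
        step[k, 0] = eps
        grad[k, 0] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

x = 0.01 * np.random.normal(size=(s, 1))  # hypothetical small random start
lr = 1e-4  # hypothetical step size, not tuned
for i in range(99):
    x = x - lr * numerical_grad(obj_fcn, x)  # step against the gradient
print(obj_fcn(x))
```

Two other details differ from the question's code. np.max(v, 0) reduces along axis 0 and returns the column maximum, whereas np.maximum(v, 0) applies ReLU to each element. And x = x + d_obj adds one scalar to every coordinate and moves uphill, so the sketch estimates a gradient per coordinate and subtracts it.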
