Code Not Converging Vanilla Gradient Descent

I have a specific analytical gradient I am using to calculate my cost f(x,y) and the gradients dx and dy. The code runs, but I can't tell whether my gradient descent is broken. Should I plot my partial derivatives with respect to x and y?

import math
import numpy as np
import matplotlib.pyplot as plt

gamma = 0.00001 # learning rate
iterations = 10000 #steps
theta = np.array([0,5]) #starting value
thetas = []
costs = []

# calculate cost of any point
def cost(theta):
    x = theta[0]
    y = theta[1]
    return 100*x*math.exp(-0.5*x*x+0.5*x-0.5*y*y-y+math.pi)

def gradient(theta):
    x = theta[0]
    y = theta[1]
    dx = 100*math.exp(-0.5*x*x+0.5*x-0.0035*y*y-y+math.pi)*(1+x*(-x + 0.5))
    dy = 100*x*math.exp(-0.5*x*x+0.5*x-0.05*y*y-y+math.pi)*(-y-1)
    gradients = np.array([dx,dy])
    return gradients

#for 2 features
for step in range(iterations):
    theta = theta - gamma*gradient(theta)
    value = cost(theta)
    thetas.append(theta)
    costs.append(value)

thetas = np.array(thetas)
X = thetas[:,0]
Y = thetas[:,1]
Z = np.array(costs)

iterations = [num for num in range(iterations)]

plt.plot(Z)
plt.xlabel("num. iteration")
plt.ylabel("cost")
plt.show()

I strongly recommend you check whether or not your analytic gradient is working correctly by first evaluating it against a numerical gradient, i.e. make sure that f'(x) ≈ (f(x+h) - f(x)) / h for some small h.
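A minimal gradient-check sketch of that idea (the test function, point, and tolerance here are illustrative assumptions, not taken from the question): compare the analytic gradient against a central-difference estimate and look at the relative error per component.

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-5):
    """Central-difference estimate of the gradient of f at theta."""
    theta = np.asarray(theta, dtype=float)
    grad = np.zeros_like(theta)
    for i in range(len(theta)):
        tp = theta.copy(); tp[i] += h   # theta with component i nudged up
        tm = theta.copy(); tm[i] -= h   # theta with component i nudged down
        grad[i] = (f(tp) - f(tm)) / (2 * h)
    return grad

# Stand-in function with a known gradient: f(x, y) = x^2 + 3y^2.
f = lambda t: t[0]**2 + 3 * t[1]**2
analytic = lambda t: np.array([2 * t[0], 6 * t[1]])

theta = np.array([1.0, -2.0])
num = numerical_gradient(f, theta)
ana = analytic(theta)
# Relative error, guarded against division by zero.
rel_err = np.abs(num - ana) / np.maximum(np.abs(num) + np.abs(ana), 1e-12)
print(rel_err)  # tiny values (~1e-10) mean the analytic gradient matches
```

If the relative error is large in one component only (as a mismatched y-coefficient in the exponent would produce), that tells you exactly which partial derivative to re-derive.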

After that, make sure your updates are actually in the right direction by picking a point where you know x or y should decrease and then checking the sign of your gradient function output.
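As a toy illustration of that sign check (the quadratic here is a stand-in, not the question's cost): for descent, the derivative must be positive wherever the variable should decrease, and negative wherever it should increase.

```python
# f(x) = (x - 3)**2 has its minimum at x = 3.
df = lambda x: 2 * (x - 3)   # analytic derivative

# At x = 5 the minimizer lies to the left, so x should decrease;
# the update x - gamma*df(x) only does that if df(5) > 0.
assert df(5.0) > 0
# At x = 1 the minimizer lies to the right, so df(1) must be negative.
assert df(1.0) < 0
print("gradient signs consistent with descent")
```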

And of course, make sure your goal is actually minimization rather than maximization.
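A quick sketch of why that sign convention matters (again on a stand-in quadratic, not the question's cost): with `theta -= gamma * grad` the iterate converges to the minimizer, while flipping the update to `+=` performs ascent and walks away from it.

```python
grad = lambda t: 2 * (t - 3)      # gradient of f(t) = (t - 3)**2, min at t = 3

t_desc = t_asc = 5.0
for _ in range(200):
    t_desc -= 0.1 * grad(t_desc)  # descent: heads to the minimizer t = 3
    t_asc  += 0.1 * grad(t_asc)   # wrong sign: moves away from the minimum
print(round(t_desc, 6), t_asc > 1e6)  # → 3.0 True
```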
