Understanding the gradient in the gradient descent algorithm in NumPy
I'm trying to figure out the Python code for the multivariate gradient descent algorithm, and I have found several implementations like this:
import numpy as np

# m denotes the number of examples here, not the number of features
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        # update
        theta = theta - alpha * gradient
    return theta
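For context, here is a small way this function could be called; the synthetic dataset and hyperparameters below are my own illustration, not part of the original post, and assume the gradientDescent function above has been defined:

# Made-up example: 100 examples, 2 features (a bias column of ones plus one input)
m = 100
x = np.column_stack([np.ones(m), np.linspace(0, 10, m)])
y = 3.0 + 2.0 * x[:, 1] + np.random.randn(m) * 0.5   # noisy line, roughly y = 3 + 2x
theta = np.zeros(2)

# Note: the function prints the cost once per iteration
theta = gradientDescent(x, y, theta, alpha=0.01, m=m, numIterations=1000)
print(theta)   # should move toward roughly [3, 2]; more iterations give a closer fit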
From the definition of gradient descent, the update rule for each parameter theta_j is:

theta_j := theta_j - alpha * (1/m) * sum_{i=1..m} (h_theta(x^(i)) - y^(i)) * x_j^(i)
However, in NumPy the gradient is being calculated as:

np.dot(xTrans, loss) / m

Can someone please explain how we arrive at this NumPy expression?
The code is actually very straightforward; it would be beneficial to spend a bit more time reading it.

hypothesis - y is the first part of the squared loss' gradient (as a vector, one component per example), and this is assigned to the loss variable. The calculation of the hypothesis looks like it's for linear regression.

xTrans is the transpose of x, so if we take the dot product of these two we get, for each feature, the sum over all examples of the products of their components. We then divide by m to get the average.
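To see concretely that np.dot(xTrans, loss) / m matches the per-feature sum over examples in the update rule, here is a small check; the data below is made up purely for illustration:

import numpy as np

x = np.array([[1.0, 2.0],
              [1.0, 3.0],
              [1.0, 5.0]])   # m = 3 examples, 2 features
y = np.array([3.0, 4.0, 7.0])
theta = np.array([0.5, 0.5])
m = x.shape[0]

loss = np.dot(x, theta) - y          # h_theta(x^(i)) - y^(i) for every example

# Explicit per-feature sum over examples, following the formula
grad_loop = np.zeros_like(theta)
for j in range(len(theta)):
    grad_loop[j] = sum(loss[i] * x[i, j] for i in range(m)) / m

# Vectorized version from the question
grad_vec = np.dot(x.T, loss) / m

print(np.allclose(grad_loop, grad_vec))   # True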
Other than that, the code has some Python style issues. We typically use under_score instead of camelCase in Python, so for example the function should be gradient_descent. More legible than Java, isn't it? :)