

Understanding gradient of gradient descent algorithm in Numpy

I'm trying to figure out the Python code for the multivariate gradient descent algorithm, and have found several implementations like this:

import numpy as np

# m denotes the number of examples here, not the number of features
def gradientDescent(x, y, theta, alpha, m, numIterations):
    xTrans = x.transpose()
    for i in range(0, numIterations):
        hypothesis = np.dot(x, theta)
        loss = hypothesis - y
        cost = np.sum(loss ** 2) / (2 * m)
        print("Iteration %d | Cost: %f" % (i, cost))
        # avg gradient per example
        gradient = np.dot(xTrans, loss) / m
        # update
        theta = theta - alpha * gradient
    return theta
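
For example, it can be run on some toy data like this (the data values, learning rate alpha, and iteration count below are made-up placeholders, not part of the original implementation):

    import numpy as np

    # Made-up toy data: 4 examples, 2 features (first column acts as a bias/intercept term)
    x = np.array([[1.0, 1.0],
                  [1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 4.0]])
    y = np.array([6.0, 5.0, 7.0, 10.0])
    m = x.shape[0]                # number of examples
    theta = np.zeros(x.shape[1])  # start from all-zero parameters

    theta = gradientDescent(x, y, theta, alpha=0.01, m=m, numIterations=100)
    print(theta)  # fitted parameters after 100 iterations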

From the definition of gradient descent, the expression for the gradient's j-th component is:

    \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}
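
Stacking the examples x^{(i)} as the rows of an m×n matrix X, this should be equivalent to the matrix form (a rewriting of the sum, assuming h_\theta(x) = X\theta as in the code):

    \nabla_\theta J = \frac{1}{m}\, X^{\top}\left(X\theta - y\right)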

However, in numpy it is being calculated as np.dot(xTrans, loss) / m. Can someone please explain how we get this numpy expression?

The code is actually very straightforward; it would be worth spending a bit more time reading it.

  • hypothesis - y is the first factor of the squared loss's gradient (in vector form, with one entry per example), and it is stored in the loss variable. The calculation of the hypothesis looks like it's for linear regression.
  • xTrans is the transpose of x, so if we take the dot product of these two we get, for each feature, the sum of the products of loss with that feature's values over all examples (see the sketch after this list).
  • We then divide by m to get the average.
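
To make this concrete, here is a small sanity check (with made-up numbers) that the matrix product np.dot(xTrans, loss) / m matches the summation formula component by component:

    import numpy as np

    # Made-up toy values: 3 examples, 2 features
    x = np.array([[1.0, 2.0],
                  [1.0, 3.0],
                  [1.0, 5.0]])
    theta = np.array([0.5, -1.0])
    y = np.array([1.0, 0.0, 2.0])
    m = len(y)

    loss = np.dot(x, theta) - y  # h_theta(x^(i)) - y^(i), one entry per example

    # Vectorized gradient, as in the question
    vectorized = np.dot(x.T, loss) / m

    # Explicit per-component sum, straight from the formula
    explicit = np.array([
        sum(loss[i] * x[i, j] for i in range(m)) / m
        for j in range(x.shape[1])
    ])

    print(np.allclose(vectorized, explicit))  # True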

Other than that, the code has some Python style issues. We typically use under_scores instead of camelCase in Python, so for example the function should be gradient_descent. More legible than Java, isn't it? :)
