
Gradient function on a matrix in Octave/MATLAB

Posted 2017-12-02 15:45:47. Tags: matlab, neural-network, octave, gradient, gradient-descent

I'm trying to implement the gradient descent algorithm in Octave/MATLAB. I have a 201x201 matrix called errors, which I assume corresponds to a function of two input variables, f(x, y). The matrix gives a nice gradient image when displayed with imagesc, but I'm confused by what I get when I calculate [dx, dy] = gradient(errors): both dx and dy are two-dimensional matrices (201x201) instead of simple vectors. I would have assumed that, since we take the partial derivative with respect to x (resp. y), the other variable y (resp. x) would disappear from the result. I'm pretty sure I'm missing something, although I feel I have a good enough understanding of how the gradient of a function works. Thank you in advance for your answer.

1 answer

The gradient is defined at a point. Your gradient call evaluates the (numerical) gradient at every one of the 201x201 grid points, which is why dx and dy are each 201x201 matrices rather than vectors.
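To see the shapes concretely, here is a minimal sketch. The question's errors data is not shown, so this assumes a hypothetical stand-in surface f(x, y) = x^2 + y^2 sampled on a 201x201 grid; the variable names x, y, X and Y are illustrative only.

```octave
% Hypothetical stand-in for the errors matrix: f(x, y) = x.^2 + y.^2
% sampled on a 201x201 grid (the real errors data is not shown in the question).
x = linspace(-1, 1, 201);
y = linspace(-1, 1, 201);
[X, Y] = meshgrid(x, y);      % X varies along columns, Y along rows
errors = X.^2 + Y.^2;

% Passing the grid vectors gives derivatives in the units of x and y.
[dx, dy] = gradient(errors, x, y);

disp(size(dx));               % same size as errors
disp(size(dy));               % same size as errors
```

Each entry of dx approximates the partial derivative with respect to x at that grid point, so y does not disappear: the value of df/dx can still differ from one y to the next.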

So, for example, the gradient of errors at the point (3,4) is the vector [dx(3,4), dy(3,4)].
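A short sketch of that lookup, again assuming a hypothetical stand-in surface f(x, y) = x^2 + y^2 so that the numerical result can be compared with a known analytic gradient (row, col, g and g_exact are illustrative names):

```octave
% Hypothetical stand-in surface, as the question's errors data is not shown.
x = linspace(-1, 1, 201);
y = linspace(-1, 1, 201);
[X, Y] = meshgrid(x, y);
errors = X.^2 + Y.^2;
[dx, dy] = gradient(errors, x, y);

% Gradient at the grid point in row 3, column 4.
row = 3;
col = 4;
g = [dx(row, col), dy(row, col)];

% With meshgrid, the column index selects x and the row index selects y,
% so for this surface the analytic gradient there is [2*x(col), 2*y(row)].
g_exact = [2 * x(col), 2 * y(row)];
disp([g; g_exact]);
```

Note that dx holds differences taken along the columns (the horizontal direction) and dy along the rows, so matrix index (3,4) refers to the fourth x value and the third y value under this grid layout.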

This example might help: https://www.mathworks.com/help/matlab/ref/gradient.html#bvhqkfr Notice how the information returned by gradient is enough to plot the whole vector field of gradients.
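Along the same lines, a minimal sketch of such a plot. It is not the code from the linked page; it assumes a hypothetical stand-in surface f(x, y) = x^2 + y^2 on a coarse 21x21 grid so the arrows stay readable.

```octave
% Hypothetical stand-in surface; a coarse 21x21 grid keeps the arrows readable.
x = linspace(-1, 1, 21);
y = linspace(-1, 1, 21);
[X, Y] = meshgrid(x, y);
errors = X.^2 + Y.^2;
[dx, dy] = gradient(errors, x, y);

% Contours of the surface with the gradient vector field drawn on top.
contour(X, Y, errors);
hold on;
quiver(X, Y, dx, dy);
hold off;
axis equal;
```

The arrows point in the direction of steepest increase, so a gradient descent step would move against them, toward the minimum at the centre of this surface.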