Gradient descent MATLAB script

Question

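I wrote the following MATLAB code as an exercise in gradient descent. I deliberately chose a function whose minimum is at (0,0), but the algorithm throws me to (-3,3).

I did find that swapping xGrad and yGrad in the line [xGrad,yGrad] = gradient(f); gives the correct convergence, even though xGrad and yGrad are approximately 2*X and 2*Y, as expected. I guess I have inverted something here, but I have been trying to work out what for a while and I don't get it, so I hope someone will spot my mistake...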

dx=.01;
dy=.01;
x=-3:dx:3;
y=-3:dy:3;
[X,Y]=meshgrid(x,y);

f=X.^2+Y.^2;

lr = .1; %learning rate
eps = 1e-10; %epsilon threshold
tooMuch = 1e5; %limit iterations
p = [.1 1]; %starting point
[~, idx] = min( abs(x-p(1)) ); %index of closest value
[~, idy] = min( abs(y-p(2)) ); %index of closest value
p = [x(idx) y(idy)]; %closest point to start
[xGrad,yGrad] = gradient(f); %partial derivatives of f
xGrad = xGrad/dx; %scale correction
yGrad = yGrad/dy; %scale correction

for i=1:tooMuch %prevents too many iterations    
    fGrad = [ xGrad(idx,idy) , yGrad(idx,idy) ]; %gradient's definition
    pTMP = p(end,:) - lr*fGrad; %gradient descent's core
    [~, idx] = min( abs(x-pTMP(1)) ); %index of closest value
    [~, idy] = min( abs(y-pTMP(2)) ); %index of closest value
    p = [p;x(idx) y(idy)]; %add the new point
    if sqrt( sum( (p(end,:)-p(end-1,:)).^2 ) ) < eps %check convergence
        break
    end
end

Any help is appreciated.

Edit: fixed typos and made the code clearer. It still does the same thing and has the same problem.

Answer 1

The X matrix returned by meshgrid has its x values increasing along the columns, not along the rows! For example, [X, Y] = meshgrid(-1:1, 1:3) returns

     [-1  0  1;           [1  1  1;
X  =  -1  0  1;       Y =  2  2  2;
      -1  0  1];           3  3  3];

Note that the x-index runs along the columns of X and Y, and the y-index runs along the rows. Specifically, your line:

fGrad = [ xGrad(idx,idy) , yGrad(idx,idy) ]; %gradient's definition

should instead be:

fGrad = [ xGrad(idy,idx) , yGrad(idy,idx) ]; %gradient's definition

The idy variable should index the rows, and the idx variable should index the columns.
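
To make the row/column convention concrete, here is a minimal sketch of the script from the question with that one indexing change applied. It assumes the same grid and test function. The names tol and maxIter are introduced here for illustration (the question calls them eps and tooMuch) so that MATLAB's built-in eps is not shadowed, and the spacings are passed to gradient directly instead of dividing afterwards.

```matlab
dx = .01; dy = .01;
x = -3:dx:3;
y = -3:dy:3;
[X, Y] = meshgrid(x, y);        % X(r,c) = x(c), Y(r,c) = y(r)
f = X.^2 + Y.^2;

% Passing the spacings lets gradient apply the scale correction itself.
[xGrad, yGrad] = gradient(f, dx, dy);

lr = .1; tol = 1e-10; maxIter = 1e5;   % tol and maxIter are illustrative names
[~, idx] = min(abs(x - .1));    % column index (x direction)
[~, idy] = min(abs(y - 1));     % row index (y direction)
p = [x(idx) y(idy)];

for i = 1:maxIter
    fGrad = [xGrad(idy, idx), yGrad(idy, idx)];   % row index first, then column
    pTMP = p(end, :) - lr*fGrad;
    [~, idx] = min(abs(x - pTMP(1)));
    [~, idy] = min(abs(y - pTMP(2)));
    p = [p; x(idx) y(idy)];
    if norm(p(end, :) - p(end-1, :)) < tol
        break
    end
end
disp(p(end, :))
```

With this indexing the path heads toward the origin instead of a corner of the grid. It stops a couple of grid steps short of (0,0), because snapping each iterate to the nearest grid point stalls the update once a step is shorter than half the grid spacing.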

Answer 2

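In the end I did not figure out what was wrong with the former method, but here is an alternative script for gradient descent, which I used to solve the same problem: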

syms x y
f = -20*(x/2-x^2-y^5)*exp(-x^2-y^2); %cost function
% f = x^2+y^2; %simple test function

g = gradient(f, [x, y]);
lr = .01; %learning rate
eps = 1e-10; %convergence threshold
tooMuch = 1e3; %iterations' limit
p = [1.5 -1]; %starting point
for i=1:tooMuch %prevents too many iterations
    pGrad = [subs(g(1),[x y],p(end,:)) subs(g(2),[x y],p(end,:))]; %computes gradient
    pTMP = p(end,:) - lr*pGrad; %gradient descent's core
    p = [p;double(pTMP)]; %adds the new point
    if sum( (p(end,:)-p(end-1,:)).^2 ) < eps %checks convergence
        break
    end
end
v = -3:.1:3; %desired axes
[X, Y] = meshgrid(v,v);
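contour(v,v,double(subs(f,[x y],{X,Y}))) %draws the contour lines (double converts the symbolic result to numeric)
hold on
quiver(v,v,double(subs(g(1), [x y], {X,Y})),double(subs(g(2), [x y], {X,Y}))) %draws the gradient directions
plot(p(:,1),p(:,2)) %draws the route
hold off
suptitle(['gradient descent route from ',mat2str(round(p(1,:),3)),' with \eta=',num2str(lr)]) %suptitle is not part of core MATLAB; sgtitle is the built-in equivalent from R2018b on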
if i<tooMuch
    title(['converged to ',mat2str(round(p(end,:),3)),' after ',mat2str(i),' steps'])
else
    title(['stopped at ',mat2str(round(p(end,:),3)),' without converging'])
end

Here are some results:

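In the latter case you can see that it did not converge, but that is not a problem with gradient descent itself. The learning rate was simply set too high, so the iterates repeatedly overshoot the minimum.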
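
The overshooting effect is easy to reproduce on a simpler function. The sketch below is not the cost function from the script; it uses the one-dimensional stand-in f(x) = x^2, and the three learning rates are values assumed for illustration. Since the gradient is 2*x, each step maps x to (1 - 2*lr)*x, so the iterates shrink only when 0 < lr < 1.

```matlab
% 1-D gradient descent on f(x) = x^2, whose gradient is 2*x.
rates = [0.1 0.9 1.1];          % illustrative learning rates (assumed values)
nSteps = 50;
xFinal = zeros(size(rates));
for k = 1:numel(rates)
    xk = 1;                     % common starting point
    for n = 1:nSteps
        xk = xk - rates(k)*2*xk;    % one gradient descent step
    end
    xFinal(k) = xk;
end
disp(abs(xFinal))               % distance from the minimum at x = 0
```

With lr = 0.9 the iterate jumps across the minimum on every step but still closes in on it, while with lr = 1.1 each jump lands farther away than the last.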

Feel free to use it.
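
If the Symbolic Math Toolbox is not available, the same loop can be sketched with a function handle and a central-difference gradient. This is a variation for illustration, not the script above: numGrad, the step h, and the names tol and maxIter are assumptions introduced here, and the finite-difference gradient is only an approximation of the symbolic one.

```matlab
% Cost function from the script above, written as a handle of p = [x y].
f = @(p) -20*(p(1)/2 - p(1)^2 - p(2)^5)*exp(-p(1)^2 - p(2)^2);

h = 1e-6;                                   % finite-difference step (assumed value)
numGrad = @(p) [f(p + [h 0]) - f(p - [h 0]), ...
                f(p + [0 h]) - f(p - [0 h])] / (2*h);   % central differences

lr = .01; tol = 1e-10; maxIter = 1e3;       % same settings as the script above
p = [1.5 -1];                               % starting point
for i = 1:maxIter
    pNew = p(end, :) - lr*numGrad(p(end, :));   % gradient descent step
    p = [p; pNew];                              % record the route
    if sum((p(end, :) - p(end-1, :)).^2) < tol
        break
    end
end
disp(p(end, :))
```

The rows of p can be plotted with plot(p(:,1),p(:,2)) exactly as in the symbolic version.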

Solution 1
1 2015-11-20 20:55:23

Solution 2
1 Accepted 2015-11-21 11:06:01
