
Gaining intuition from gradient descent update rule

Gradient descent update rule:

theta_j := theta_j - alpha * (1/m) * sum over i=1..m of (h(x(i)) - y(i)) * x_j(i)

(with every theta_j updated simultaneously, and h(x) the model's prediction)

Using these values for this rule:

x = [10 20 30 40 50 60 70 80 90 100]
y = [4 7 8 4 5 6 7 5 3 4]

After two iterations using a learning rate of 0.07, theta comes out as:

-73.396
-5150.803
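
These numbers can be checked by hand. A minimal sketch of the same two updates (my own verification, assuming theta starts at zeros and both components are updated simultaneously, as the vectorised code in the source below does):

x = [10 20 30 40 50 60 70 80 90 100]';
y = [4 7 8 4 5 6 7 5 3 4]';
x = [ones(10, 1), x];                                % add the intercept column
theta = zeros(2, 1);
theta = theta - 0.07 * (1/10) * x' * (x*theta - y)   % first step:  [0.371; 19.25]
theta = theta - 0.07 * (1/10) * x' * (x*theta - y)   % second step: [-73.396; -5150.8]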

After three iterations, theta is:

1.9763e+04
   1.3833e+06

So it appears theta gets larger after the second iteration, which suggests the learning rate is too large.
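
As an aside on why this happens: for gradient descent on a least-squares cost, the usual convergence condition is that alpha must be smaller than 2 divided by the largest eigenvalue of (1/m) * x' * x, where x is the augmented matrix built in the source below. A quick check of that threshold for this data:

eig((1/10) * (x' * x))   % largest eigenvalue is about 3851, so alpha has to be
                         % below roughly 2/3851, i.e. about 5e-4; 0.07 diverges,
                         % while 0.000007 converges (slowly)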

So I set:

iterations = 300;
alpha = 0.000007;

theta is now:

 0.0038504
 0.0713561

Should these theta values allow me to draw a straight line through the data? If so, how? I've just begun trying to understand gradient descent, so please point out any errors in my logic.

Source:

x = [10 20 30 40 50 60 70 80 90 100]'   % predictor values (column vector)
y = [4 7 8 4 5 6 7 5 3 4]'              % observed responses (column vector)

m = length(y)                  % number of training examples

x = [ones(m, 1), x]            % prepend a column of ones for the intercept term

theta = zeros(2, 1);           % start both parameters at zero

iterations = 300;
alpha = 0.000007;

for iter = 1:iterations
    % Vectorised update: theta := theta - (alpha/m) * x' * (x*theta - y)
    theta = theta - ((1/m) * ((x * theta) - y)' * x)' * alpha;
    theta                      % display theta after each iteration
end

plot(x(:,2), y, 'o');          % plot the raw data (column 2 holds the x values)
ylabel('Response Time')
xlabel('Time since 0')

Update:

So the predictions x*theta, one for each x value, plot a straight line:

plot(x(:,2), x*theta, '-')

[Figure: the values of x*theta plotted against the x values form a straight line]

Update 2:

How does this relate to the linear regression model:

h(x) = theta_0 + theta_1 * x

given that the model also outputs a prediction value?
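
On the prediction part: once theta is good enough, the model's prediction for any input is the same product used to draw the line. A minimal sketch (the input value 105 is just an arbitrary example of mine):

x_new = 105;                        % hypothetical new input value
prediction = [1, x_new] * theta     % h(x_new) = theta(1) + theta(2) * x_new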

Yes, you should be able to draw a straight line. In regression, gradient descent is an algorithm used to minimize the cost (error) function of your linear regression model. You follow the gradient downhill toward the minimum of the cost function, and the learning rate determines how quickly you travel along that path. Go too fast and you might overshoot the global minimum. When you have reached the desired minimum, plug those values of theta into your model to obtain the estimated model. In the one-dimensional case, this is a straight line.
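
To make that last sentence concrete, a minimal sketch of overlaying the estimated line on the data, reusing the question's variables (x is the augmented matrix, so x*theta is the fitted value for every example):

plot(x(:,2), y, 'o');           % the data
hold on;
plot(x(:,2), x * theta, '-');   % the estimated straight line
hold off;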

Check out this article, which gives a nice introduction to gradient descent.
