
Linear regression using gradient descent algorithm, getting unexpected results

I'm trying to create a function that returns the values of θ0 and θ1 of the hypothesis function of linear regression. But I'm getting different results for different initial (random) values of θ0 and θ1.

What's wrong with the code?

training_data_set = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
initial_theta = [1, 0]


def gradient_descent(data, theta0, theta1):
    def h(x, theta0, theta1):
        return theta0 + theta1 * x

    m = len(data)
    alpha = 0.01

    for n in range(m):
        cost = 0
        for i in range(m):
            cost += (h(data[i][0], theta0, theta1) - data[i][1])**2

        cost = cost/(2*m)

        error = 0
        for i in range(m):
            error += h(data[i][0], theta0, theta1) - data[i][1]

        theta0 -= alpha*error/m
        theta1 -= alpha*error*data[n][0]/m

    return theta0, theta1


for i in range(5):
    initial_theta = gradient_descent(training_data_set, initial_theta[0], initial_theta[1])


final_theta0 = initial_theta[0]
final_theta1 = initial_theta[1]

print(f'theta0 = {final_theta0}\ntheta1 = {final_theta1}')

Output:

When initial_theta = [0, 0]

theta0 = 0.27311526522692103
theta1 = 0.7771301328221445


When initial_theta = [1, 1]

theta0 = 0.8829506006170339
theta1 = 0.6669442287905096

Convergence

You've run five iterations of gradient descent over just 5 training samples with a (probably reasonable) learning rate of 0.01. That is not expected to bring you to a "final" answer for your problem: you'd need to run many iterations of gradient descent, just like you implemented, repeating the process until your thetas converge to a stable value. Then it would make sense to compare the resulting values.

Replace the 5 in `for i in range(5)` with 5000 and then look at what happens. It might be illustrative to plot the decrease of the cost function to see how fast the process converges to a solution.
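For illustration, here is a minimal sketch of that experiment, using the same data and learning rate as the question but the textbook batch update, in which θ1's gradient weights each residual by its x value (the question's code instead reuses the summed θ0 error). It records the cost at every step so the convergence curve can be plotted:

```python
# Minimal sketch: batch gradient descent with a cost history.
# Same data and learning rate as the question; 5000 iterations.
training_data_set = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
alpha = 0.01
theta0, theta1 = 0.0, 0.0
m = len(training_data_set)
cost_history = []

for _ in range(5000):
    # Residual (prediction - target) and input x for every sample.
    residuals = [(theta0 + theta1 * x - y, x) for x, y in training_data_set]
    cost_history.append(sum(r * r for r, _ in residuals) / (2 * m))
    # Batch update: theta1's gradient weights each residual by its x.
    theta0 -= alpha * sum(r for r, _ in residuals) / m
    theta1 -= alpha * sum(r * x for r, x in residuals) / m

print(f'theta0 = {theta0}\ntheta1 = {theta1}')
print(f'cost: {cost_history[0]:.3f} -> {cost_history[-1]:.3f}')
```

With this data the parameters settle near θ0 ≈ 0.4 and θ1 ≈ 0.8 (the exact least-squares fit), and the cost falls from about 4.8 toward about 0.24; plotting `cost_history` shows how quickly that happens.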

This is not a problem, rather a very usual thing. To see why, you need to understand how gradient descent works. Every time you randomly initialise your parameters, the hypothesis starts its journey from a random place. With every iteration it updates the parameters so that the cost function converges. In your case, since you ran gradient descent for only 5 iterations, different initialisations end up with very different results. Try more iterations and you will see significant similarity even with different initialisations. A visualisation of the descent would make this clear.
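That claim is easy to check directly. Below is a small sketch (same data as the question, textbook batch updates, hypothetical helper name `fit`) that runs two different initialisations for many iterations and shows they land on essentially the same parameters:

```python
# Sketch: with enough iterations, different initialisations converge
# to (nearly) the same thetas. Same training data as the question.
def fit(data, theta0, theta1, alpha=0.01, iterations=5000):
    m = len(data)
    for _ in range(iterations):
        # Compute both gradients before updating either parameter.
        grad0 = sum(theta0 + theta1 * x - y for x, y in data) / m
        grad1 = sum((theta0 + theta1 * x - y) * x for x, y in data) / m
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

data = [[1, 1], [2, 3], [4, 3], [3, 2], [5, 5]]
print(fit(data, 0, 0))  # starting from (0, 0)
print(fit(data, 1, 1))  # starting from (1, 1): nearly identical result
```

Both runs should agree to several decimal places, whereas with only 5 iterations they stay close to wherever they started.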

Here is how I see gradient descent: imagine that you are high up on a rocky mountainside in the fog. Because of the fog, you cannot see the fastest path down the mountain. So, you look around your feet and step down based on what you see nearby. After taking a step, you look around your feet again, and take another step. Sometimes this will trap you in a small low spot where you cannot see any way down (a local minimum), and sometimes this will get you safely to the bottom of the mountain (the global minimum). Starting from different random locations on the foggy mountainside might trap you in different local minima, though you might find your way down safely if the random starting location is good. (For linear regression specifically, the cost surface is convex, a single bowl, so with enough iterations every starting point rolls down to the same global minimum.)
