Gradient descent not working: theta values keep increasing and go to infinity

Here is my code in Python. I tried different alpha values and different implementations, but I never get good theta values: they go to infinity even with a learning rate of 0.000000001. Can you give me any suggestions about this code?

    def gradient_descent(x,y):
        theta0=theta1=0
        iteration=10000000
        m=len(x)
        alpha=0.000000001
        for i in range(iteration):
            y_pre=theta1*x+theta0            #equation for y_prediction
            theta1d=-(2/m)*sum(x*(y-y_pre))  #finding derivative
            theta0d=-(2/m)*sum(y-y_pre)
            theta1=theta1+alpha*theta1d
            theta0=theta0+alpha*theta0d
            #cost_func=(sum((y-y_pre)**2)/2m
            print("theta1",theta1,"theta0",theta0)

The update rule is theta = theta - alpha * (partial derivative of the cost function). The cost function is normally defined as cost_func = sum((y_pre - y)**2)/(2*m), so its partial derivatives contain y_pre - y, not y - y_pre: they are (1/m)*sum(x*(y_pre - y)) for theta1 and (1/m)*sum(y_pre - y) for theta0. In the code below, theta1d and theta0d hold the negative of those derivatives, so adding alpha*theta1d is the same as subtracting alpha times the derivative. The code in the question instead adds alpha times the derivative itself, which moves theta uphill and makes it diverge. Here is how you should calculate the cost function and theta values. (Try different alpha values; I did not modify alpha or the number of iterations.)

    def gradient_descent(x,y):
        theta0=theta1=0
        iteration=10000000
        m=len(x)
        alpha=0.000000001
        for i in range(iteration):
            y_pre=theta1*x+theta0              # prediction for the current theta values
            theta1d=-(1/m)*sum(x*(y_pre - y))  # negative partial derivative w.r.t. theta1 (modified)
            theta0d=-(1/m)*sum(y_pre - y)      # negative partial derivative w.r.t. theta0 (modified)
            theta1=theta1+alpha*theta1d        # same as theta1 - alpha * derivative
            theta0=theta0+alpha*theta0d
            #cost_func=sum((y_pre - y)**2)/(2*m)   # modified
            print("theta1",theta1,"theta0",theta0)
        return theta0, theta1
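As a minimal sketch of why the sign of the update matters, the snippet below runs the same loop twice on a small synthetic dataset, once subtracting the gradient and once adding it. The data, the helper name `fit_line`, and its `sign` parameter are assumptions made for illustration; they are not from the original post, and the larger alpha is chosen only to suit this toy data.

```python
import numpy as np

def fit_line(x, y, alpha=0.01, iterations=10000, sign=-1.0):
    """Fit y ~ theta1 * x + theta0 by batch gradient descent.

    sign=-1.0 subtracts the gradient (descent).
    sign=+1.0 adds it, which is the diverging update from the question.
    """
    theta0 = theta1 = 0.0
    m = len(x)
    for _ in range(iterations):
        y_pre = theta1 * x + theta0
        # Partial derivatives of J = sum((y_pre - y)**2) / (2*m)
        grad1 = np.sum(x * (y_pre - y)) / m
        grad0 = np.sum(y_pre - y) / m
        theta1 += sign * alpha * grad1
        theta0 += sign * alpha * grad0
    return theta0, theta1

# Hypothetical data: points lying exactly on the line y = 3x + 2.
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x + 2.0

good = fit_line(x, y)                            # theta - alpha * gradient
bad = fit_line(x, y, iterations=50, sign=+1.0)   # theta + alpha * gradient
print("descent (theta0, theta1):", good)
print("ascent  (theta0, theta1):", bad)
```

With the gradient subtracted, the parameters should approach the intercept 2 and slope 3 used to build the data; with it added, they grow rapidly in magnitude after only a few dozen steps. If a loop with the correct sign still diverges, alpha is probably too large for the scale of x, and reducing alpha or rescaling x is worth trying.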


Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.