Linear Regression - Unexpected Results (Python)

I am getting unexpected results from the linear regression implementation I coded. Sometimes I get an out-of-memory error, sometimes errors in the squaring or multiplication steps; basically, the values grow too large to handle.

The code looks fine to me, and I can't identify why it fails to work.

import numpy as np

X = np.array([ 6.1101,  5.5277,  8.5186,  7.0032,  5.8598,  8.3829, 7.4764,
        8.5781,  6.4862,  5.0546,  5.7107, 14.164 ,  5.734 ,  8.4084,
        5.6407,  5.3794,  6.3654,  5.1301,  6.4296,  7.0708,  6.1891,
       20.27  ,  5.4901,  6.3261,  5.5649, 18.945 , 12.828 , 10.957 ,
       13.176 , 22.203 ,  5.2524,  6.5894,  9.2482,  5.8918,  8.2111,
        7.9334,  8.0959,  5.6063, 12.836 ,  6.3534,  5.4069,  6.8825,
       11.708 ,  5.7737,  7.8247,  7.0931,  5.0702,  5.8014, 11.7   ,
        5.5416,  7.5402,  5.3077,  7.4239,  7.6031,  6.3328,  6.3589,
        6.2742,  5.6397,  9.3102,  9.4536,  8.8254,  5.1793, 21.279 ,
       14.908 , 18.959 ,  7.2182,  8.2951, 10.236 ,  5.4994, 20.341 ,
       10.136 ,  7.3345,  6.0062,  7.2259,  5.0269,  6.5479,  7.5386,
        5.0365, 10.274 ,  5.1077,  5.7292,  5.1884,  6.3557,  9.7687,
        6.5159,  8.5172,  9.1802,  6.002 ,  5.5204,  5.0594,  5.7077,
        7.6366,  5.8707,  5.3054,  8.2934, 13.394 ,  5.4369])
y = np.array([17.592  ,  9.1302 , 13.662  , 11.854  ,  6.8233 , 11.886  ,
        4.3483 , 12.     ,  6.5987 ,  3.8166 ,  3.2522 , 15.505  ,
        3.1551 ,  7.2258 ,  0.71618,  3.5129 ,  5.3048 ,  0.56077,
        3.6518 ,  5.3893 ,  3.1386 , 21.767  ,  4.263  ,  5.1875 ,
        3.0825 , 22.638  , 13.501  ,  7.0467 , 14.692  , 24.147  ,
       -1.22   ,  5.9966 , 12.134  ,  1.8495 ,  6.5426 ,  4.5623 ,
        4.1164 ,  3.3928 , 10.117  ,  5.4974 ,  0.55657,  3.9115 ,
        5.3854 ,  2.4406 ,  6.7318 ,  1.0463 ,  5.1337 ,  1.844  ,
        8.0043 ,  1.0179 ,  6.7504 ,  1.8396 ,  4.2885 ,  4.9981 ,
        1.4233 , -1.4211 ,  2.4756 ,  4.6042 ,  3.9624 ,  5.4141 ,
        5.1694 , -0.74279, 17.929  , 12.054  , 17.054  ,  4.8852 ,
        5.7442 ,  7.7754 ,  1.0173 , 20.992  ,  6.6799 ,  4.0259 ,
        1.2784 ,  3.3411 , -2.6807 ,  0.29678,  3.8845 ,  5.7014 ,
        6.7526 ,  2.0576 ,  0.47953,  0.20421,  0.67861,  7.5435 ,
        5.3436 ,  4.2415 ,  6.7981 ,  0.92695,  0.152  ,  2.8214 ,
        1.8451 ,  4.2959 ,  7.2029 ,  1.9869 ,  0.14454,  9.0551 ])  # source is cut off here: one y value is missing (X has 97 entries)
theta = np.array([0,0]) #Initial values of theta_0 and theta_1

#Calculates cost function for given theta
def costFunction(X,y,theta):
    m = y.size
    hypothesis = (X * theta[1]) + theta[0]
    return (1/m) * sum((hypothesis - y) ** 2)

#Calculates the partial derivatives of theta_0 and theta_1
def slope(X,y,theta):
    m = y.size  # number of samples; was undefined inside this function
    hypothesis = (X * theta[1]) + theta[0]
    theta_0 = 2/(m) * sum(hypothesis - y) 
    theta_1 = 2/(m) * sum((hypothesis - y) * X)
    return np.array([theta_0,theta_1])

#Runs gradient descent for 200 iterations with learning rate 0.1
for i in range(200):
    theta = theta - 0.1*slope(X,y,theta)

print(costFunction(X,y,theta)) # Prints inf

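Your learning rate is too large, so gradient descent diverges instead of converging: each update overshoots the minimum by more than the previous one, and theta grows until the cost overflows to inf. Try lowering it to 0.01 and running more iterations; that worked for me.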
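
To illustrate the point, here is a minimal sketch that runs the same update rule with both learning rates. It does not use the question's data: `X` and `y` are synthetic stand-ins on a similar scale, and the names `gradient_descent`, `max_stable_rate` and `theta_exact` are mine. For a squared-error cost, plain gradient descent is only stable when the learning rate is below 2 divided by the largest eigenvalue of the Hessian, so the sketch computes that bound as well.

```python
import numpy as np

# Synthetic stand-in data (an assumption, not the question's values):
# 97 points with x in [5, 12] and a roughly linear trend.
X = np.linspace(5.0, 12.0, 97)
y = -4.0 + 1.2 * X + 0.5 * np.sin(5.0 * X)

def cost(X, y, theta):
    """Mean squared error of the line theta[0] + theta[1] * X."""
    residual = theta[0] + theta[1] * X - y
    return np.mean(residual ** 2)

def gradient(X, y, theta):
    """Partial derivatives of the cost w.r.t. theta[0] and theta[1]."""
    residual = theta[0] + theta[1] * X - y
    return np.array([2.0 * np.mean(residual), 2.0 * np.mean(residual * X)])

def gradient_descent(X, y, learning_rate, n_iters):
    """Plain batch gradient descent starting from theta = (0, 0)."""
    theta = np.zeros(2)
    for _ in range(n_iters):
        theta = theta - learning_rate * gradient(X, y, theta)
    return theta

# Stability bound for this quadratic cost: 2 / (largest Hessian eigenvalue).
A = np.column_stack([np.ones_like(X), X])
hessian = 2.0 * A.T @ A / X.size
max_stable_rate = 2.0 / np.linalg.eigvalsh(hessian).max()

# Learning rate 0.1 is above the bound, so theta blows up.
with np.errstate(over="ignore", invalid="ignore"):
    theta_big = gradient_descent(X, y, learning_rate=0.1, n_iters=200)
    cost_big = cost(X, y, theta_big)

# Learning rate 0.01 is below the bound, but it needs many more iterations.
theta_small = gradient_descent(X, y, learning_rate=0.01, n_iters=20000)
cost_small = cost(X, y, theta_small)

# Closed-form least-squares fit, used only as a reference.
theta_exact = np.linalg.lstsq(A, y, rcond=None)[0]

print("largest stable learning rate:", max_stable_rate)
print("cost with rate 0.1: ", cost_big)
print("cost with rate 0.01:", cost_small)
print("theta from descent: ", theta_small)
print("theta from lstsq:   ", theta_exact)
```

The bound depends on the scale of `X`, so 0.01 is not a universally safe value. With larger feature values the usable learning rate shrinks, which is one reason features are often standardized before running gradient descent.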
