I'm a beginner data science student and I was asked to code a linear regression from scratch, including gradient descent, following the teacher's instructions (e.g. which functions to implement) and using numpy.
The whole thing was working OK, but when I call the gradient_descent() function I keep getting an OverflowError raised inside the gradient() function, at the point where I sum all the elements of a vector to compute the gradient with respect to one estimator. The weird thing is that gradient() works fine when called on its own, but overflows when called from gradient_descent().
I tried rounding the intermediate results so that whatever was overflowing would not overflow, and I tried isolating every intermediate result. I'm using Python 3.7.3 on macOS 10.14.6 with Jupyter.
Here is my code:
import numpy as np
import random

def predict(x, th):
    if x.shape[1] != th.shape[0]:
        return "ERROR : The number of covariable columns is not equal to number of lines in parameter matrix !"
    else:
        return (x@th)

def error(x, th, y):
    return (y - predict(x, th))

def gradient(x, th, y):
    grad = np.full(th.shape[0], 1)
    for i in range(grad.shape[0]):
        err = error(x, th, y).transpose()
        temp = x[:, i]*err
        grad[i] = temp.sum()
    return grad

def gradient_descent(x, th, y, a=0.01):
    i = 0
    while i < 2000:
        dif = a*gradient(x, th, y)
        th = th - dif
        i += 1
        if dif.all() < 0.5:
            break
    return th

th = np.full(13, 1).reshape(13, 1)  # just for testing purposes
predict(x_train, th)
error(x_train, th, y_train).shape
cost_fun(x_train, th, y_train)
gradient_descent(x_train, th, y_train)
And the error that comes with it:
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-414-5e608e709a9e> in <module>
4 error(x_train, th, y_train).shape
5 cost_fun(x_train, th, y_train)
----> 6 gradient_descent(x_train, th, y_train)
7
8
<ipython-input-413-57df3054d402> in gradient_descent(x, th, y, a)
25 i = 0
26 while i<2000:
---> 27 dif = a*gradient(x,th,y)
28 th = th - dif
29 i += 1
<ipython-input-413-57df3054d402> in gradient(x, th, y)
19 err = error(x,th,y)[i]
20 temp = x[i,:]*err
---> 21 grad[i] = round(temp.sum(), ndigits=10)
22 return grad
23
OverflowError: Python int too large to convert to C long
When I run gradient(x_train, th, y_train) on its own, I get this:
array([ -98761915, -398968695, -1128435471, -1089578372, -7619613, -54698832, -620945173, -6731108064, -378298899, -932523483, -40174412843, -1826831673, 34647602295])
The gradient_descent() should return a vector of optimised parameters. What could possibly be wrong?!
Hi, just trace your error by printing the loop number and the values of th and dif on every iteration, right before you execute the statement dif = a*gradient(x,th,y), then check the last values printed before you hit the error. I don't have x_train and y_train, so I can't run the code. If possible, share a link to part of the data so that I can take a look.
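The tracing suggested above could look roughly like the sketch below. It is only an illustration under a few assumptions: x_demo and y_demo are made-up data (the original x_train and y_train were not shared), traced_gradient_descent is a hypothetical helper name, and gradient() here is a vectorised floating-point stand-in for the one in the question, not the asker's exact function. The update rule th = th - a*gradient(x, th, y) is the same as in the question.

```python
import numpy as np

def gradient(x, th, y):
    # Vectorised stand-in for the question's gradient(): x^T (y - x @ th).
    # Everything stays in floating point, so large values are not forced
    # into an integer array.
    return x.T @ (y - x @ th)

def traced_gradient_descent(x, th, y, a=0.01, max_iter=2000):
    # Hypothetical helper: same update rule as the question's loop, but it
    # records the iteration number and the largest |th| and |dif| seen.
    history = []
    for i in range(max_iter):
        dif = a * gradient(x, th, y)
        history.append((i, float(np.abs(th).max()), float(np.abs(dif).max())))
        if not np.all(np.isfinite(dif)):
            break  # stop as soon as the step is no longer a finite number
        th = th - dif
    return th, history

# Made-up data, only so the sketch runs on its own.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(50, 3))
y_demo = x_demo @ np.array([[1.0], [2.0], [3.0]])
th0 = np.ones((3, 1))

th_final, history = traced_gradient_descent(x_demo, th0, y_demo, max_iter=20)
for step, th_max, dif_max in history[:5]:
    print(step, th_max, dif_max)
```

On this made-up data the recorded step size grows from one iteration to the next instead of shrinking, which is the kind of pattern such a trace is meant to expose. Whether the same happens with the real x_train and y_train can only be checked by running the trace on them.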