I keep getting an OverflowError in Python 3 using numpy, but only inside a function (the same code works fine outside it)

I'm a beginner data science student and I was asked to code a linear regression from scratch, including gradient descent, following the teacher's instructions (e.g. which functions to implement) and using numpy.

The whole thing was working OK, but when I call gradient_descent() I keep getting an OverflowError raised inside gradient(), at the point where I sum all elements of a vector to compute the gradient with respect to one parameter. The weird thing is that gradient() works fine on its own but overflows when it is called from gradient_descent().

I tried rounding the intermediate results so that nothing would overflow, and I tried isolating every result. I'm using Python 3.7.3 on macOS 10.14.6 with Jupyter.

Here is my code:

import numpy as np
import random

def predict(x,th):
    if x.shape[1] != th.shape[0]:
        return "ERROR : The number of covariable columns is not equal to number of lines in parameter matrix !"
    else :
        return (x@th)

def error(x,th,y):
    return (y-predict(x,th))

def gradient(x,th,y):
    grad = np.full(th.shape[0],1)
    for i in range(grad.shape[0]):
        err = error(x,th,y).transpose()
        temp = x[:,i]*err 
        grad[i] = temp.sum()
    return grad

def gradient_descent(x,th,y,a = 0.01):
    i = 0
    while i<2000:
        dif = a*gradient(x,th,y)
        th = th - dif
        i += 1
        if dif.all()<0.5:
            break
    return th

th = np.full(13,1).reshape(13,1) #just for testing purposes

predict(x_train, th)
error(x_train, th, y_train).shape
cost_fun(x_train, th, y_train)
gradient_descent(x_train, th, y_train)

And the error that comes with it:

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-414-5e608e709a9e> in <module>
      4 error(x_train, th, y_train).shape
      5 cost_fun(x_train, th, y_train)
----> 6 gradient_descent(x_train, th, y_train)
      7 
      8 

<ipython-input-413-57df3054d402> in gradient_descent(x, th, y, a)
     25     i = 0
     26     while i<2000:
---> 27         dif = a*gradient(x,th,y)
     28         th = th - dif
     29         i += 1

<ipython-input-413-57df3054d402> in gradient(x, th, y)
     19         err = error(x,th,y)[i]
     20         temp = x[i,:]*err
---> 21         grad[i] = round(temp.sum(), ndigits=10)
     22     return grad
     23 

OverflowError: Python int too large to convert to C long

When I run gradient(x_train,th,y_train) on its own, I get this:

array([ -98761915, -398968695, -1128435471, -1089578372, -7619613, -54698832, -620945173, -6731108064, -378298899, -932523483, -40174412843, -1826831673, 34647602295])

The gradient_descent() should return a vector of optimised parameters. What could possibly be wrong?!
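
A few properties of the posted code can be checked without the training data. The sketch below only reuses the array constructions from the question (`np.full(13, 1)` and the `(13, 1)` reshape). The name `grad_float` is hypothetical and introduced here for illustration. It is a hedged diagnosis, not a confirmed root cause: it shows that the gradient buffer and `th` start out with an integer dtype, that subtracting a `(13,)` gradient from a `(13, 1)` parameter vector broadcasts to a `(13, 13)` matrix, and that `dif.all() < 0.5` compares a single boolean with 0.5.

```python
import numpy as np

# Same construction as in the question: np.full with an integer fill value
# yields an integer array unless a dtype is given.
grad = np.full(13, 1)
th = np.full(13, 1).reshape(13, 1)
print(grad.dtype, th.dtype)          # integer dtypes (exact width is platform dependent)

# A float buffer avoids storing float sums into integer slots.
grad_float = np.zeros(13, dtype=float)
print(grad_float.dtype)              # float64

# th has shape (13, 1) but the gradient has shape (13,), so the update
# broadcasts to a (13, 13) matrix instead of staying a column vector.
dif = 0.01 * grad
updated = th - dif
print(updated.shape)                 # (13, 13)

# dif.all() collapses the array to one boolean, so `dif.all() < 0.5`
# compares True/False with 0.5 rather than testing the step size.
print(dif.all(), dif.all() < 0.5)
```

If the values grow from one iteration to the next, a float sum that no longer fits in a C long being written into the integer `grad` array would be consistent with the "Python int too large to convert to C long" message, and it would also explain why a single call to gradient() succeeds while repeated calls inside gradient_descent() fail.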

Hi, just trace the error by printing the loop number and the values of th and dif on every iteration, right before the statement dif = a*gradient(x,th,y) executes, then check the last values printed before the error occurs. I don't have x_train and y_train, so I can't run the code. If possible, share a link to part of the data so that I can take a look.
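
The tracing suggested above could look roughly like the sketch below. Since x_train and y_train are not available, it runs on hypothetical stand-in data (`x_demo`, `y_demo`, `true_th`), and the parameters `max_iter`, `tol` and `trace_every` are names introduced here, not part of the original code. It also makes three assumptions that go beyond the post: the gradient is taken of the mean squared error (hence the minus sign, because error is defined as y minus the prediction), everything is kept as floats, and the stopping test looks at the absolute size of each step. Treat it as one possible variant, not as the intended solution to the assignment.

```python
import numpy as np

def gradient(x, th, y):
    """Gradient of 1/(2n) * ||y - x @ th||^2 with respect to th."""
    err = y - x @ th                   # residuals, shape (n, 1)
    return -(x.T @ err) / x.shape[0]   # float array, shape (p, 1)

def gradient_descent(x, th, y, a=0.01, max_iter=2000, tol=1e-8, trace_every=0):
    th = th.astype(float)              # keep the parameters as floats
    for i in range(max_iter):
        dif = a * gradient(x, th, y)
        if trace_every and i % trace_every == 0:
            # Trace: iteration, shape of th, largest |th| and largest |dif|.
            print(i, th.shape, float(np.abs(th).max()), float(np.abs(dif).max()))
        th = th - dif
        if np.all(np.abs(dif) < tol):  # stop once every step component is tiny
            break
    return th

# Hypothetical stand-in data: 200 rows, 3 standardised features, no noise.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(200, 3))
true_th = np.array([[2.0], [-1.0], [0.5]])
y_demo = x_demo @ true_th

th_hat = gradient_descent(x_demo, np.ones((3, 1)), y_demo, a=0.1, trace_every=50)
print(th_hat.ravel())
```

In a healthy run the traced shape of th stays `(p, 1)` and the largest step shrinks over time. If the shape changes or the magnitudes keep growing, the update is diverging, which on real data often points to a wrong gradient sign, unscaled features, or a learning rate that is too large.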
