Gradient descent algorithm taking a long time to complete - Efficiency - Python
I am trying to implement the gradient descent algorithm using Python; the following is my code:
import time
import numpy as np

def grad_des(xvalues, yvalues, R=0.01, epsilon=0.0001, MaxIterations=1000):
    xvalues = np.array(xvalues)
    yvalues = np.array(yvalues)
    length = len(xvalues)
    alpha = 1
    beta = 1
    converged = False
    i = 0
    cost = sum([(alpha + beta*xvalues[i] - yvalues[i])**2 for i in range(length)]) / (2 * length)
    start_time = time.time()
    while not converged:
        alpha_deriv = sum([(alpha + beta*xvalues[i] - yvalues[i]) for i in range(length)]) / length
        beta_deriv = sum([(alpha + beta*xvalues[i] - yvalues[i])*xvalues[i] for i in range(length)]) / length
        alpha = alpha - R * alpha_deriv
        beta = beta - R * beta_deriv
        new_cost = sum([(alpha + beta*xvalues[i] - yvalues[i])**2 for i in range(length)]) / (2 * length)
        if abs(cost - new_cost) <= epsilon:
            print 'Converged'
            print 'Number of Iterations:', i
            converged = True
        cost = new_cost
        i = i + 1
        if i == MaxIterations:
            print 'Maximum Iterations Exceeded'
            converged = True
    print "Time taken: " + str(round(time.time() - start_time, 2)) + " seconds"
    return alpha, beta
This code is working fine. But the problem is that it takes more than 25 seconds for approximately 600 iterations. I feel this is not efficient enough, so I tried converting the inputs to arrays before doing the calculations. That did reduce the time from 300 seconds to 25 seconds. Still, I feel it can be reduced further. Can anybody help me improve this algorithm?
Thanks
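For anyone wanting to reproduce the timing, here is a minimal sketch of generating synthetic linear data to feed into grad_des; the slope, intercept, noise level, and seed here are just illustrative assumptions, not values from the question:

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the run is reproducible
xvalues = rng.random(1024)      # 1024 points in [0, 1)
# y = 0.5*x + 0.3 plus a little Gaussian noise
yvalues = 0.5 * xvalues + 0.3 + 0.05 * rng.standard_normal(1024)

# alpha, beta = grad_des(xvalues, yvalues)
```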
As I commented, I can't reproduce the slowness; however, here are some potential issues:
It looks like length does not change, but you are repeatedly invoking range(length). In Python 2.x, range creates a list, and doing this repeatedly can slow things down (object creation is not cheap). Use xrange (or import a Py3-compatible iterator range from six or future) and create the range once up front rather than each time.
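A minimal sketch of hoisting the range out of the loop (written here with Python 3's lazy range; under Python 2 the same idea applies with xrange):

```python
length = 1000
indices = range(length)  # created once, reused on every iteration

total = 0
for _ in range(3):       # stands in for the while-loop passes
    # no new range object is built inside the loop body
    total += sum(i for i in indices)
```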
i is being reused here in a way that could cause problems. You're trying to use it as the overall iteration count, but each of your list comprehensions that uses i will overwrite i in the scope of the function, which means that the "iteration" count will always end up as length - 1.
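In Python 2, a list comprehension's loop variable leaks into the enclosing scope, which is what clobbers the counter above. (Python 3 gives comprehensions their own scope, but a plain for loop still leaks in both versions, so this small demonstration shows the same hazard:)

```python
i = 0        # intended as the outer iteration counter
values = []
for i in range(5):        # reuses the name i, clobbering the counter
    values.append(i * 2)

print(i)     # the counter is now 4, not 0 -- it was overwritten
```

The fix is simply to use a distinct name (e.g. j) inside the comprehensions.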
The lowest-hanging fruit that I can see is vectorization. You have a lot of list comprehensions; they're faster than for loops, but they have nothing on proper use of numpy arrays.
import time
import numpy as np

def grad_des_vec(xvalues, yvalues, R=0.01, epsilon=0.0001, MaxIterations=1000):
    xvalues = np.array(xvalues)
    yvalues = np.array(yvalues)
    length = len(xvalues)
    alpha = 1
    beta = 1
    converged = False
    i = 0
    cost = np.sum((alpha + beta * xvalues - yvalues)**2) / (2 * length)
    start_time = time.time()
    while not converged:
        alpha_deriv = np.sum(alpha + beta * xvalues - yvalues) / length
        beta_deriv = np.sum((alpha + beta * xvalues - yvalues) * xvalues) / length
        alpha = alpha - R * alpha_deriv
        beta = beta - R * beta_deriv
        new_cost = np.sum((alpha + beta * xvalues - yvalues)**2) / (2 * length)
        if abs(cost - new_cost) <= epsilon:
            print('Converged')
            print('Number of Iterations:', i)
            converged = True
        cost = new_cost
        i = i + 1
        if i == MaxIterations:
            print('Maximum Iterations Exceeded')
            converged = True
    print("Time taken: " + str(round(time.time() - start_time, 2)) + " seconds")
    return alpha, beta
For comparison:
In [47]: grad_des(xval, yval)
Converged
Number of Iterations: 198
Time taken: 0.66 seconds
Out[47]:
(0.28264882215511067, 0.53289263416071131)
In [48]: grad_des_vec(xval, yval)
Converged
Number of Iterations: 198
Time taken: 0.03 seconds
Out[48]:
(0.28264882215511078, 0.5328926341607112)
That's about a factor-of-20 speedup (xval and yval were both 1024-element arrays).
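To see where the speedup comes from, compare the per-iteration cost computation in the two styles; the sketch below assumes 1024-element arrays as in the timings above, with made-up alpha, beta, and data:

```python
import numpy as np

xvalues = np.linspace(0.0, 1.0, 1024)
yvalues = 2.0 * xvalues + 1.0
alpha, beta = 1.0, 1.0
length = len(xvalues)

# comprehension version: a Python-level loop doing 1024 scalar operations
cost_loop = sum((alpha + beta * xvalues[i] - yvalues[i]) ** 2
                for i in range(length)) / (2 * length)

# vectorized version: a single C-level pass over the whole array
cost_vec = np.sum((alpha + beta * xvalues - yvalues) ** 2) / (2 * length)

assert np.isclose(cost_loop, cost_vec)  # same result, far fewer interpreter steps
```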