Gradient Descent Algorithm in Python

I am trying to write a gradient descent function in Python as part of a multivariate linear regression exercise. It runs, but does not compute the correct answer. My code is below. I've been trying for weeks to finish this problem but have made zero progress.

I believe that I understand the concept of gradient descent to optimize a multivariate linear regression function, and also that the 'math' is correct. I believe that the error is in my code, but I am still learning Python. Your help is very much appreciated.

import numpy as np
from math import sqrt

def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False
    weights = np.array(initial_weights)
    while not converged:
        predictions = np.dot(feature_matrix, weights)
        errors = predictions - output
        gradient_sum_squares = 0
        for i in range(len(weights)):
            derivative = -2 * np.dot(errors[i], feature_matrix[i])
            gradient_sum_squares = gradient_sum_squares + np.dot(derivative, derivative)
            weights[i] = weights[i] - step_size * derivative[i]
        gradient_magnitude = sqrt(gradient_sum_squares)
        print(gradient_magnitude)
        if gradient_magnitude < tolerance:
            converged = True
    return weights

The feature matrix is:

sales = gl.SFrame.read_csv('kc_house_data.csv', column_type_hints={
    'bathrooms': float, 'waterfront': int, 'sqft_above': int,
    'sqft_living15': float, 'grade': int, 'yr_renovated': int,
    'price': float, 'bedrooms': float, 'zipcode': str, 'long': float,
    'sqft_lot15': float, 'sqft_living': float, 'floors': str,
    'condition': int, 'lat': float, 'date': str, 'sqft_basement': int,
    'yr_built': int, 'id': str, 'sqft_lot': int, 'view': int})

I'm calling the function as:

train_data,test_data = sales.random_split(.8,seed=0)
simple_features = ['sqft_living']
my_output= 'price'
(simple_feature_matrix, output) = get_numpy_data(train_data, simple_features, my_output)
initial_weights = np.array([-47000., 1.])
step_size = 7e-12
tolerance = 2.5e7    
simple_weights = regression_gradient_descent(simple_feature_matrix, output,initial_weights,step_size,tolerance)

**get_numpy_data is just a function to convert everything into arrays and works as intended**
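Since the helper itself isn't shown, here is a minimal sketch of what a `get_numpy_data` of this kind might look like. This is an assumption, not the course's actual code: I'm guessing it prepends a constant column of 1s for the intercept and returns NumPy arrays; the real function takes an SFrame, while the toy below uses a plain dict of columns.

```python
import numpy as np

def get_numpy_data(data, features, output):
    # Hypothetical reconstruction: prepend a constant column of 1s for the
    # intercept, then pull out the requested feature columns and the output
    # column as NumPy arrays.
    n = len(data[output])
    feature_matrix = np.column_stack(
        [np.ones(n)] + [np.asarray(data[f], dtype=float) for f in features])
    output_array = np.asarray(data[output], dtype=float)
    return feature_matrix, output_array

# Tiny stand-in for the SFrame: a dict of columns.
data = {'sqft_living': [1000.0, 2000.0, 3000.0],
        'price': [300000.0, 500000.0, 700000.0]}
X, y = get_numpy_data(data, ['sqft_living'], 'price')
print(X.shape)  # (3, 2): constant column plus one feature
```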

Update: I fixed the formula to:

derivative = 2 * np.dot(errors,feature_matrix)

and it seems to have worked. The derivation of this formula in my online course used

-2 * np.dot(errors,feature_matrix)

and I'm not sure why this formula did not provide the correct answer.
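One likely explanation (an assumption about the course's convention, worth checking against its notes): the sign of the gradient formula depends on how `errors` is defined. The code above uses `errors = predictions - output`, for which the RSS gradient is `2 * np.dot(errors, feature_matrix)`; a derivation that defines `errors = output - predictions` gets the `-2` factor instead. The two conventions give the same gradient:

```python
import numpy as np

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # constant + one feature
y = np.array([1.0, 2.0, 4.0])
w = np.array([0.0, 0.5])

pred = X.dot(w)
errors_a = pred - y   # predictions minus output (as in the code above)
errors_b = y - pred   # output minus predictions (the course's convention)

grad_a = 2 * np.dot(errors_a, X)    # RSS gradient under convention (a)
grad_b = -2 * np.dot(errors_b, X)   # same gradient under convention (b)
print(np.allclose(grad_a, grad_b))  # True: the two formulas agree
```

So using `-2` with `errors = predictions - output` flips the sign of every update, which makes the weights move *up* the gradient and diverge.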

The step size seems too small, and the tolerance unusually big. Perhaps you meant to use them the other way around?

In general, the step size is determined by a trial-and-error procedure: the "natural" step size α=1 might lead to divergence, so one could try to lower the value (e.g. taking α=1/2, α=1/4, etc.) until convergence is achieved. Don't start with a very small step size.
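The trial-and-error procedure above can be sketched as follows on a toy least-squares problem (the helper name and the divergence test are my own, not from the question):

```python
import numpy as np

# Toy least-squares problem: X w = y has the exact solution w = [1, 1].
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])

loss = lambda w: np.sum((X.dot(w) - y) ** 2)
step = lambda w, a: w - a * 2 * X.T.dot(X.dot(w) - y)  # one gradient step

def find_step_size(alpha=1.0, iters=200, max_halvings=30):
    """Start at the 'natural' alpha = 1 and halve it until a fixed-length
    run of gradient descent makes progress instead of diverging."""
    w0 = np.zeros(2)
    with np.errstate(over='ignore', invalid='ignore'):  # divergent runs overflow
        for _ in range(max_halvings):
            w = w0.copy()
            for _ in range(iters):
                w = step(w, alpha)
            if loss(w) < loss(w0):   # made progress -> accept this alpha
                return alpha, w
            alpha /= 2.0             # diverged -> halve and retry
    raise RuntimeError("no convergent step size found")

alpha, w = find_step_size()
print(alpha, w)  # the first convergent alpha, with weights near [1, 1]
```

The same idea applies to the question's setup: rather than guessing a tiny value like 7e-12 up front, start large and halve until the printed gradient magnitude stops growing.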
