
How to parallelize simple linear regression with gradient descent - using numpy?

I'm having trouble using numpy to parallelize the for loop below (get_new_weights). With my first attempt at df_dm in update_weights, the weight comes out completely wrong. With my second attempt at df_dm, the weight overshoots the optimal value.

Note - bias is a single number and weight is a single number (one-variable linear regression), X has shape (442, 1), and y has shape (442, 1). Also note that updating the bias term works perfectly in update_weights - it's just the weight update that I'm having trouble with.

# This is the for loop that I am trying to parallelize with numpy:
def get_new_weights(X, y, weight, bias, learning_rate=0.01):
    weight_deriv = 0
    bias_deriv = 0
    total = len(X)
    for i in range(total):
        # -2x(y - (mx + b))
        weight_deriv += -2*X[i] * (y[i] - (weight*X[i] + bias))
        # -2(y - (mx + b))
        bias_deriv += -2*(y[i] - (weight*X[i] + bias))

    weight -= (weight_deriv / total) * learning_rate
    bias -= (bias_deriv / total) * learning_rate
    return weight, bias
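For context, here is a minimal run of the loop version on made-up data (the values are hypothetical, not from the question) showing that it does converge to the true slope and intercept:

```python
import numpy as np

def get_new_weights(X, y, weight, bias, learning_rate=0.01):
    # Same loop as in the question: accumulate per-sample gradients, then average.
    weight_deriv = 0
    bias_deriv = 0
    total = len(X)
    for i in range(total):
        weight_deriv += -2 * X[i] * (y[i] - (weight * X[i] + bias))
        bias_deriv += -2 * (y[i] - (weight * X[i] + bias))
    weight -= (weight_deriv / total) * learning_rate
    bias -= (bias_deriv / total) * learning_rate
    return weight, bias

# Synthetic data: y = 3x + 2 exactly, so the fit should recover m = 3, b = 2.
X = np.linspace(0.0, 1.0, 25).reshape(-1, 1)
y = 3.0 * X + 2.0

weight, bias = 0.0, 0.0
for _ in range(3000):
    weight, bias = get_new_weights(X, y, weight, bias, learning_rate=0.1)

print(weight.item(), bias.item())  # close to 3.0 and 2.0
```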

# This is my attempt at parallelization
import numpy as np

def update_weights(X, y, weight, bias, lr=0.01):
    df_dm = np.average(-2*X * (y - (weight*X + bias)))  # this was my first guess
    # df_dm = np.average(np.dot((-X).T, ((weight*X + bias) - y)))  # this was my second guess
    df_db = np.average(-2*(y - (weight*X + bias)))
    weight = weight - (lr * df_dm)
    bias = bias - (lr * df_db)
    return weight, bias
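As a side note, a shape mismatch that commonly breaks this kind of vectorized gradient (hypothetical here, since the question states both X and y are (442, 1)): if y is 1-D, `y - (weight*X + bias)` silently broadcasts to a 2-D matrix, and np.average then averages N*N bogus "residuals" instead of N real ones:

```python
import numpy as np

X = np.arange(5.0).reshape(-1, 1)   # shape (5, 1)
y_col = 2.0 * X                     # shape (5, 1) -- matches X
y_flat = y_col.ravel()              # shape (5,)   -- the pitfall

weight, bias = 0.5, 0.0

# Correct: (5,1) - (5,1) stays (5,1), one residual per sample.
good = -2 * X * (y_col - (weight * X + bias))
# Broken: (5,) - (5,1) broadcasts to (5,5) -- 25 bogus "residuals".
bad = -2 * X * (y_flat - (weight * X + bias))

print(good.shape, bad.shape)              # (5, 1) (5, 5)
print(np.average(good), np.average(bad))  # different values
```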

This is the equation I am using for updating my weight (m) and bias (b), matching the comments in the loop above:

dL/dm = (1/N) * Σ_i −2·x_i·(y_i − (m·x_i + b))
dL/db = (1/N) * Σ_i −2·(y_i − (m·x_i + b))

Thanks to everyone who took a look at my question. I'm using the term "parallelization" loosely to refer to the runtime optimization I was looking for by removing the need for a for loop. The answer to this problem is:

df_dm = (1/len(X)) * np.dot((-2*X).T, (y-(weight*X+bias)))
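A quick check on made-up data that this np.dot expression matches the per-sample loop (the data and parameter values below are illustrative, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(442, 1))
y = 1.7 * X + 0.3 + rng.normal(scale=0.1, size=(442, 1))
weight, bias = 0.9, -0.2

# Loop version of the weight gradient, averaged at the end as in get_new_weights.
weight_deriv = 0.0
for i in range(len(X)):
    weight_deriv += -2 * X[i] * (y[i] - (weight * X[i] + bias))
loop_df_dm = weight_deriv / len(X)

# Vectorized version from the answer: (1, 442) @ (442, 1) -> (1, 1).
df_dm = (1 / len(X)) * np.dot((-2 * X).T, (y - (weight * X + bias)))

print(loop_df_dm, df_dm)  # same value up to floating-point rounding
```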

The issue here was making sure that all of the arrays produced by the intermediate steps had the correct shape. And, for those interested in the runtime difference between these two functions: the for loop took about 10 times longer.


 