
Using Gradient Descent on Linear Regression Yields an Incorrect Bias

I've set up a toy example of a linear regression model with one input variable and one output variable. The problem I'm encountering is that the output for the bias is far off from the generated data. If I manually set the bias, the model produces a weight and bias close enough to the original.

I've written two pieces of code: gen_data, which generates data, and gradientDescent2, which performs the gradient descent algorithm to find the weight and bias.

import numpy as np

def gen_data(num_points=50, slope=1, bias=10, x_max=50):
    f = lambda z: slope * z + bias
    x = np.zeros(shape=(num_points, 1))
    y = np.zeros(shape=(num_points, 1))

    for i in range(num_points):
        x_temp = np.random.uniform()*x_max
        x[i] = x_temp
        y[i] = f(x_temp) + np.random.normal(scale=3.0)

    return (x, y)

# \mathbb{R}^1 with no regularization
def gradientDescent2(x, y, learning_rate=0.0001, epochs=100):
    theta = np.random.rand()
    bias = np.random.rand()

    for i in range(0, epochs):
        loss = (theta * x + bias) - y
        cost = np.mean(loss**2) / 2
        # print('Iteration {} | Cost: {}'.format(i, cost))

        grad_b = np.mean(loss)
        grad_t = np.mean(loss*x)

        # updates
        bias -= learning_rate * grad_b
        theta -= learning_rate * grad_t

    return (theta, bias)

1. If you want to use batch updates, don't set your batch size equal to your sample size. (I also believe that batch update is not very suitable for this case.)

2. Your gradient calculation and parameter update are incorrect; the gradient should be:

grad_b =  1
grad_t =  x
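
These values follow from differentiating the model's prediction with respect to each parameter (a short derivation, consistent with the values above):

```latex
\frac{\partial}{\partial b}\left(\theta x_j + b\right) = 1,
\qquad
\frac{\partial}{\partial \theta}\left(\theta x_j + b\right) = x_j
```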

For the parameter update, you should always try to minimize the loss, so it should be:

if loss>0:
  bias -= learning_rate * grad_b
  theta -= learning_rate * grad_t
elif loss< 0:
  bias += learning_rate * grad_b
  theta += learning_rate * grad_t
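
The if/elif update above is equivalent to stepping by the sign of the loss. Here is a minimal self-contained sketch of that rule (assuming noise-free data with slope 1 and bias 10, as in gen_data):

```python
import numpy as np

np.random.seed(0)
x = np.random.uniform(0, 50, size=500)   # inputs, as in gen_data
y = 1.0 * x + 10.0                       # noise-free targets: slope=1, bias=10

theta, bias = 0.0, 0.0
learning_rate = 0.001
mse_before = np.mean((theta * x + bias - y) ** 2)

for epoch in range(100):
    for j in range(len(x)):
        loss = (theta * x[j] + bias) - y[j]
        step = np.sign(loss)                    # +1, -1, or 0: same effect as the if/elif above
        bias -= learning_rate * step * 1        # grad_b = 1
        theta -= learning_rate * step * x[j]    # grad_t = x[j]

mse_after = np.mean((theta * x + bias - y) ** 2)
print(mse_before, mse_after)
```

Note that np.sign returns 0 when the loss is exactly zero, which matches the if/elif structure taking no step in that case.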

Finally, here is the modified code, which works well:

import numpy as np
import sys

def gen_data(num_points=500, slope=1, bias=10, x_max=50):
    f = lambda z: slope * z + bias
    x = np.zeros(shape=(num_points))
    y = np.zeros(shape=(num_points))

    for i in range(num_points):
        x_temp = np.random.uniform()*x_max
        x[i] = x_temp
        y[i] = f(x_temp) #+ np.random.normal(scale=3.0)
        #print('x:',x[i],'        y:',y[i])

    return (x, y)

def gradientDescent2(x, y, learning_rate=0.001, epochs=100):
    theta = np.random.rand()
    bias = np.random.rand()

    for i in range(0, epochs):
        for j in range(len(x)):
            loss = (theta * x[j] + bias) - y[j]
            cost = np.mean(loss**2) / 2
            # print('Iteration {} | Cost: {}'.format(i, cost))

            grad_b = 1
            grad_t = x[j]

            if loss > 0:
                bias -= learning_rate * grad_b
                theta -= learning_rate * grad_t
            elif loss < 0:
                bias += learning_rate * grad_b
                theta += learning_rate * grad_t

    return (theta, bias)

def main():
    x, y = gen_data()
    theta, bias = gradientDescent2(x, y)
    print('theta:', theta)
    print('bias:', bias)

if __name__ == '__main__':
    sys.exit(int(main() or 0))
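
For reference (not part of the original answer), the same line can be recovered in closed form with np.polyfit; comparing against such a baseline is a quick way to check a gradient-descent implementation:

```python
import numpy as np

np.random.seed(0)
x = np.random.uniform(0, 50, size=500)
y = 1.0 * x + 10.0  # same generating line as gen_data: slope=1, bias=10

# Degree-1 least-squares fit returns (slope, intercept)
slope, intercept = np.polyfit(x, y, deg=1)
print('slope:', slope)
print('bias:', intercept)
```

On noise-free data the fit recovers the generating slope and bias essentially exactly.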
