
What is the issue with this basic linear regression?

Currently trying to run a very basic linear regression using some test data points in a Jupyter notebook. Below is my code; as you can see, if you run it the prediction line certainly moves towards where it should go, but then it stops for some reason and I'm not really sure why. Can anyone help me?

[Image: starting weights]

[Image: ending weights]

[Image: loss]

import matplotlib.pyplot as plt
import numpy as np
%matplotlib notebook
plt.style = "ggplot"

y = np.array([30,70,90,120,150,160,190,220])
x = np.arange(2,len(y)+2)

N = len(y)
weights = np.array([0.2,0.2])

plt.figure()
plt.scatter(x, y, color="red")
plt.plot(y_hat)
x_ticks = np.array([[1,x*0.1] for x in range(100)])
y_hat = []
for j in range(len(x_ticks)):
    y_hat.append(np.dot(weights, x_ticks[j]))

def plot_model(x, y, weights, loss):
    x_ticks = np.array([[1,x*0.1] for x in range(100)])
    y_hat = []
    for j in range(len(x_ticks)):
        y_hat.append(np.dot(weights, x_ticks[j]))

    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(y_hat)
    plt.figure()
    plt.plot(loss)

def calculate_grad(weights, N, x_proc, y, loss):
    residuals = np.sum(y.reshape(N,1) - weights*x_proc, 1)
    loss.append(sum(residuals**2)/2)
    #print(residuals, x_proc)
    return -np.dot(residuals, x_proc)

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights

learning_rate = 0.006
epochs = 2000
loss = []

x_processed = np.array([[1,i] for i in x])

for j in range(epochs):
    grad = calculate_grad(weights, N, x_processed, y, loss)
    weights = adjust_weights(weights, grad, learning_rate)
    if j % 200 == 0:
        print(weights, grad)
plot_model(x, y, weights, loss)

There are a couple of problems.

First, let's talk about the manner in which you are trying to find your parameters. You have some issues with the way you're doing your matrix and vector multiplications. I like to visualize the weights and y as column vectors.

Then you'll know that we need to take the dot product of your processed x matrix and the weights column vector. That's step 1.
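
To make the shapes concrete, here is a minimal sketch of that first step, assuming the same toy data used in the code below (this snippet is just an illustration of the shapes involved):

import numpy as np

y = np.array([[30, 70, 90, 120, 150, 160, 190, 220]]).T  # column vector, shape (8, 1)
x_proc = np.array([[1, i] for i in range(2, 10)])         # design matrix, shape (8, 2)
weights = np.array([[0.2], [0.2]])                        # column vector, shape (2, 1)

y_hat = np.dot(x_proc, weights)                           # (8, 2) @ (2, 1) -> (8, 1)
print(y_hat.shape)                                        # one prediction per observation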

Now, remember your chain rule. You were on the right track with your gradient, but you need to remember to multiply x_proc * residuals by (-2/n), where n is the number of observations you have!
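
Written out, that gradient corresponds to the mean squared error loss over n observations:

L(w) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - x_i^{\top}w\bigr)^2, \qquad \nabla_w L = -\frac{2}{n}\,X^{\top}(y - Xw)

so the code computes residuals = y - Xw and scales X^T residuals by -2/n.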

Here's that code:

y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)

N = len(y)
weights = np.array([0.2,0.2])

def calculate_grad(weights, N, x_proc, y, loss):
    y_hat = np.dot(x_proc, weights.reshape(2,1))
    residuals = y - y_hat
    gradient = (-2/float(len(x_proc)))*sum(x_proc * residuals)
    return gradient 

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights

Now for the plotting issue.

There's no need to increment by 0.1 on the x. You should simply use x_proc as you did when finding your weights. Like so:

def plot_model(x, y, weights, loss):
    y_hat = []
    for j in x:
        y_hat.append(np.dot(weights, [1, j]))

    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(x, y_hat)
    plt.show()

And ta-da: with 2000 iterations you get weights [-12.80036278, 25.75042317], which is very close to the actual solution, [-13.33333, 25.833333].

Here's working code:

import numpy as np
import matplotlib.pyplot as plt 

y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)

N = len(y)
weights = np.array([0.2,0.2])


def plot_model(x, y, weights, loss):
    y_hat = []
    for j in x:
        y_hat.append(np.dot(weights, [1, j]))

    plt.figure()
    plt.scatter(x, y, color="red")
    plt.plot(x, y_hat)
    plt.show()

def calculate_grad(weights, N, x_proc, y, loss):
    # predictions as a column vector: (N, 2) @ (2, 1) -> (N, 1)
    y_hat = np.dot(x_proc, weights.reshape(2,1))
    residuals = y - y_hat
    # mean-squared-error gradient: -(2/N) * X^T residuals
    # (the builtin sum adds up the rows of the (N, 2) array, giving a length-2 vector)
    gradient = (-2/float(len(x_proc)))*sum(x_proc * residuals)
    return gradient

def adjust_weights(weights, grad, learning_rate):
    weights -= learning_rate*grad
    return weights

learning_rate = 0.006
epochs = 2000
loss = []

x_processed = np.array([[1,i] for i in x])

for j in range(epochs):
    grad = calculate_grad(weights, N, x_processed, y, loss)
    weights = adjust_weights(weights, grad, learning_rate)

plot_model(x, y, weights, loss)

I think it would be helpful for me to explain how I solved this.

  1. I started on pen and paper. It's not so important to find numerical values, but you must understand the order of operations (for example, multiplying the matrix x by a column vector of weights, and not vice versa). Your code confused me because I was unsure of the mental map you had built for your order of operations.

  2. Then from there, writing the code is easy.

  3. If you want to check whether your solution is right, you can use the closed-form solution (if there is one ;) ) to the least squares problem: beta = (XᵀX)⁻¹ Xᵀ y. In your case:

y = np.array([[30,70,90,120,150,160,190,220]]).T
x = np.arange(2,len(y)+2)
x = np.matrix([[1, x] for x in x])

beta_pt1 = np.linalg.inv(x.T*x)
beta_pt2 = x.T*y
beta = beta_pt1*beta_pt2
print(beta_pt1*beta_pt2)
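
As an alternative sanity check, numpy's built-in least-squares solver should give essentially the same answer; a minimal sketch, assuming the same x and y as above:

import numpy as np

y = np.array([[30, 70, 90, 120, 150, 160, 190, 220]]).T
X = np.array([[1, i] for i in range(2, 2 + len(y))])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # solves min ||X beta - y||^2
print(beta)  # should be close to [[-13.333...], [25.833...]]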
