

Multiple linear regression in python without fitting the origin?

I found this chunk of code on http://rosettacode.org/wiki/Multiple_regression#Python , which does a multiple linear regression in Python. Printing b in the following code gives you the coefficients of x1, ..., xN. However, this code fits the line through the origin (i.e. the resulting model does not include a constant term).

All I'd like to do is the exact same thing, except I do not want to fit the line through the origin; I need the constant term in my resulting model.

Any idea if it's a small modification to do this? I've searched and found numerous documents on multiple regression in Python, but they are lengthy and overly complicated for what I need. This code works perfectly, except I need a model that includes an intercept rather than passing through the origin.

import numpy as np
from numpy.random import random

n = 100   # number of observations
k = 10    # number of predictors

y = np.mat(random((1, n)))   # response, shape (1, n)
X = np.mat(random((k, n)))   # predictors, shape (k, n)

# Closed-form normal-equations solution; note there is no intercept term.
b = y * X.T * np.linalg.inv(X * X.T)
print(b)

Any help would be appreciated. Thanks.

You just need to add a row of all 1's to X.
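
A minimal sketch of that change, reusing the variables from the question (the name X1 and the np.vstack/np.ones calls are my additions, not part of the original snippet):

import numpy as np
from numpy.random import random

n = 100
k = 10
y = np.mat(random((1, n)))
X = np.mat(random((k, n)))

# Append a row of ones so the model also estimates a constant term.
X1 = np.mat(np.vstack([np.asarray(X), np.ones((1, n))]))

# Same normal-equations solution as before, on the augmented matrix.
b = y * X1.T * np.linalg.inv(X1 * X1.T)

print(b)         # k slope coefficients followed by the intercept
print(b[0, -1])  # the intercept is the last entry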

Maybe a more stable approach would be to use a least-squares algorithm anyway. This can also be done in numpy in a few lines. Read the documentation for numpy.linalg.lstsq .

Here you can find an example implementation:

http://glowingpython.blogspot.de/2012/03/linear-regression-with-numpy.html
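
A short sketch of that route, assuming the more common rows-as-observations layout rather than the (k, n) layout from the question (the names A and coef here are mine, not from the linked post):

import numpy as np
from numpy.random import random

n, k = 100, 10
y = random(n)        # targets, shape (n,)
X = random((n, k))   # predictors, one row per observation

# Add a column of ones so lstsq also estimates an intercept.
A = np.column_stack([X, np.ones(n)])

# Least-squares fit; coef[:k] are the slopes, coef[k] is the intercept.
coef, residuals, rank, sv = np.linalg.lstsq(A, y, rcond=None)

print("slopes:   ", coef[:k])
print("intercept:", coef[k])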

What you have written out, b = y * X.T * np.linalg.inv(X * X.T) , is the solution to the normal equations, which gives the least-squares fit with a multilinear model. swang's response is correct (as is EMS's elaboration): you need to add a row of 1's to X. If you want some idea of why it works theoretically, keep in mind that you are finding b_i such that

y_j = sum_i b_i x_{ij}.

By adding a row of 1's, you are setting x_{(k+1)j} = 1 for all j, which means that you are finding b_i such that:

y_j = (sum_i b_i x_{ij}) + b_{k+1}

because the (k+1)-th x_ij term is always equal to one. Thus, b_{k+1} is your intercept term.
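
As a quick numerical sanity check of that argument (this snippet is mine, not from the answer), the normal-equations solution on the augmented matrix should match numpy.linalg.lstsq on the same data, with the intercept appearing as the last coefficient:

import numpy as np
from numpy.random import random

n, k = 100, 10
y = np.mat(random((1, n)))
X = np.mat(random((k, n)))

# Augmented design: the (k+1)-th row is all ones, so the last coefficient is b_{k+1}.
X1 = np.mat(np.vstack([np.asarray(X), np.ones((1, n))]))

# Normal-equations solution, as in the question but with the extra row.
b_normal = y * X1.T * np.linalg.inv(X1 * X1.T)

# Equivalent least-squares solution on the transposed (n-by-(k+1)) layout.
b_lstsq, *_ = np.linalg.lstsq(np.asarray(X1).T, np.asarray(y).ravel(), rcond=None)

# Both give the same k slopes and the same intercept, up to floating-point error.
print(np.allclose(np.asarray(b_normal).ravel(), b_lstsq))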
