
Two dimensional linear regression coefficients

I am doing linear regression with two dimensional variables:

 filtered[['p_tag_x', 'p_tag_y', 's_tag_x', 's_tag_y']].head()

     p_tag_x      p_tag_y            s_tag_x     s_tag_y
35    589.665646  1405.580171        517.5       1636.5
36    589.665646  1405.580171        679.5       1665.5
100   610.546851  2425.303250        569.5       2722.0
101   610.546851  2425.303250        728.0       2710.0
102   717.237730  1411.842428        820.0       1616.5



clt = linear_model.LinearRegression()
clt.fit(filtered[['p_tag_x', 'p_tag_y']], filtered[['s_tag_x', 's_tag_y']])

I am getting the following coefficients for the regression:

clt.coef_

array([[ 0.4529769 , -0.22406594],
       [-0.00859452, -0.00816968]])

And the residues (X_0 and Y_0):

clt.residues_
array([ 1452.97816371,    69.12754694])

How should I understand the above coefficient matrix in terms of the regression line?

As I already explained in the comments, you get an extra dimension in coef_ (and in intercept_) because you have 2 targets (y.shape == (n_samples, n_targets)). In this case sklearn will fit 2 independent regressors, one for each target.

You can then just take those n regressors apart and handle each one on its own.
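
For example, here is a minimal sketch of that idea (it assumes the filtered DataFrame and the fitted clt from the question are still in scope): fitting each target column on its own reproduces one row of coef_ and one entry of intercept_ each.

from sklearn import linear_model

features = filtered[['p_tag_x', 'p_tag_y']]

# one single-output regressor per target column
reg_x = linear_model.LinearRegression().fit(features, filtered['s_tag_x'])
reg_y = linear_model.LinearRegression().fit(features, filtered['s_tag_y'])

print(reg_x.coef_, reg_x.intercept_)  # same as clt.coef_[0], clt.intercept_[0]
print(reg_y.coef_, reg_y.intercept_)  # same as clt.coef_[1], clt.intercept_[1]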

The formula of your regression line is still:

y(w, x) = intercept_ + coef_[0] * x[0] + coef_[1] * x[1] ... 
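
As a quick sanity check (a sketch assuming the filtered DataFrame and the fitted clt from the question are available), evaluating this formula by hand for target 0 at the first sample gives the same number as clt.predict():

# manual evaluation of y(w, x) for target 0 (s_tag_x) at the first sample
x0, x1 = filtered['p_tag_x'].iloc[0], filtered['p_tag_y'].iloc[0]
manual = clt.intercept_[0] + clt.coef_[0, 0] * x0 + clt.coef_[0, 1] * x1

# library evaluation: one row per sample, one column per target
library = clt.predict(filtered[['p_tag_x', 'p_tag_y']].iloc[[0]])[0, 0]

print(manual, library)  # identical up to floating point noise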

Sadly your example is a bit harder to visualize because of the dimensionality.

Consider this a demo, with a lot of ugly hard-coding for this specific case (and bad example data!):

Code:

# Warning: ugly demo-like code using a lot of hard-coding!!!!!

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from sklearn import linear_model

X = np.array([[589.665646,  1405.580171],
              [589.665646,  1405.580171],
              [610.546851,  2425.303250],
              [610.546851,  2425.303250],
              [717.237730,  1411.842428]])

y = np.array([[517.5,       1636.5],
              [679.5,       1665.5],
              [569.5,       2722.0],
              [728.0,       2710.0],
              [820.0,       1616.5]])

clt = linear_model.LinearRegression()
clt.fit(X, y)

print(clt.coef_)
print(clt.residues_)  # note: residues_ was deprecated and later removed in newer scikit-learn releases

def curve_0(x, y):  # target 0; single-point evaluation hardcoded for 2 features!
    return clt.intercept_[0] + x * clt.coef_[0, 0] + y * clt.coef_[0, 1]

def curve_1(x, y):  # target 1; single-point evaluation hardcoded for 2 features!
    return clt.intercept_[1] + x * clt.coef_[1, 0] + y * clt.coef_[1, 1]

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

xs = [np.amin(X[:, 0]), np.amax(X[:, 0])]
ys = [np.amin(X[:, 1]), np.amax(X[:, 1])]

# regressor 0
ax.scatter(X[:, 0], X[:, 1], y[:, 0], c='blue')
ax.plot([xs[0], xs[1]], [ys[0], ys[1]], [curve_0(xs[0], ys[0]), curve_0(xs[1], ys[1])], c='cyan')

# regressor 1
ax.scatter(X[:, 0], X[:, 1], y[:, 1], c='red')
ax.plot([xs[0], xs[1]], [ys[0], ys[1]], [curve_1(xs[0], ys[0]), curve_1(xs[1], ys[1])], c='magenta')

ax.set_xlabel('X[:, 0] feature 0')
ax.set_ylabel('X[:, 1] feature 1')
ax.set_zlabel('Y')

plt.show()

Output:

[3D plot: scatter of both targets (blue/red) with the two fitted regression lines (cyan/magenta)]

Remarks:

  • You don't have to calculate the formula by yourself: clt.predict() will do that (see the sketch after this list)!
  • The code lines involving ax.plot(...) rely on the assumption that our line is defined by just 2 points (it is linear)!
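
To illustrate the first remark, here is a small sketch (reusing X, np and clt from the demo code above): the whole intercept_ + coef_ formula collapses into one matrix expression, and clt.predict() returns exactly that.

# manual prediction for all samples and both targets at once
pred_manual = clt.intercept_ + X @ clt.coef_.T   # shape (n_samples, n_targets)
pred_sklearn = clt.predict(X)

print(np.allclose(pred_manual, pred_sklearn))    # True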
