Linear regression on numpy polynomials using sklearn

Question

I am trying to fit a linear system of polynomials to data. numpy's polynomial module includes a fitting function, which works perfectly. When I try to fit the model with an sklearn linear solver, the fit is terrible, and I don't understand what is going wrong. I construct a matrix X where x_{ij} corresponds to the i-th observed input and the j-th polynomial. I know the X matrix is OK because, when I find the coefficients with numpy, the data fits perfectly. I use sklearn's fit function (I have tried several linear solvers), but the coefficients it solves for (the coef_ attribute) are just wrong. What am I doing wrong? How can I make the coefficients found by the sklearn linear solver match the coefficients found by numpy?

import numpy as np
from sklearn import linear_model
from sklearn.linear_model import OrthogonalMatchingPursuit
import matplotlib.pyplot as plt

# accept x and polynomial order, return basis of that order
def legs(x, c):
   s = np.zeros(c + 1)
   s[-1] = 1
   return np.polynomial.legendre.legval(x, s)

# Generate normalized samples   
samples = np.random.uniform(2, 3, 5)
evals = samples ** 2
xnorm = (samples - 2) * 2 / (3 - 2) - 1

# instantiate linear regressor
omp = linear_model.LinearRegression()
#omp = linear_model.Lasso(alpha=0.000001)
#omp = OrthogonalMatchingPursuit(n_nonzero_coefs=2)

# construct X matrix. Each row is an observed value. 
#  Each column is a different polynomial.
X = np.array([[legs(xnorm[jj], ii) for ii in range(5)] for jj in range(xnorm.size)])

# Perform the fit. Why isn't this working?
omp.fit(X, evals)

# Plot the truth data
plt.scatter(xnorm, evals, label='data', s=15, marker='x')

# Dot the coefficients found with sklearn against X
plt.scatter(xnorm, omp.coef_.dot(X.T), label='linear regression')

# Dot the coefficients found with numpy against X
plt.scatter(xnorm, np.polynomial.legendre.legfit(xnorm, evals, 4).dot(X.T), label='Numpy regression')

# complete the plot
plt.legend(ncol=3, prop={'size':3})
plt.savefig('simpleExample')
plt.clf()

Answer 1

Your omp.coef_.dot(X.T) doesn't include the intercept. By default, LinearRegression fits a separate intercept term and stores it in omp.intercept_ rather than in omp.coef_. Add it manually, or simply use omp.predict directly.

I.e.:

plt.scatter(xnorm, omp.coef_.dot(X.T) + omp.intercept_, label='linear regression')
plt.scatter(xnorm, evals, label='data', s=15, marker='x')

or

plt.scatter(xnorm, omp.predict(X), label='linear regression')
plt.scatter(xnorm, evals, label='data', s=15, marker='x')
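
The second part of the question, making the sklearn coefficients match the numpy ones, is not covered by the fix above. One possible approach, sketched here under a few assumptions, is to disable sklearn's separate intercept with fit_intercept=False, so that the constant Legendre column of X carries the offset just as it does in legfit. The names model, default_model and numpy_coefs, the fixed seed, and the 20 samples are illustrative choices, not part of the original post; legvander is used as a shortcut that should build the same X matrix as the legs helper.

```python
import numpy as np
from numpy.polynomial import legendre
from sklearn.linear_model import LinearRegression

# Same setup as the question, but with a fixed seed and more samples
# (both are assumptions made for reproducibility).
rng = np.random.default_rng(0)
samples = rng.uniform(2, 3, 20)
evals = samples ** 2
xnorm = (samples - 2) * 2 / (3 - 2) - 1

# Design matrix: row i is sample i, column j is Legendre polynomial P_j.
X = legendre.legvander(xnorm, 4)

# Reference coefficients from numpy.
numpy_coefs = legendre.legfit(xnorm, evals, 4)

# Default sklearn model: the constant term ends up in intercept_,
# and the coefficient of the all-ones column P_0 is left at zero.
default_model = LinearRegression().fit(X, evals)
print("default coef_:     ", default_model.coef_)
print("default intercept_:", default_model.intercept_)

# Without a separate intercept, the P_0 column carries the constant term,
# so coef_ can be compared to the numpy coefficients directly.
model = LinearRegression(fit_intercept=False).fit(X, evals)
print("coefficients match numpy:", np.allclose(model.coef_, numpy_coefs))
```

With the default settings, the value that numpy reports as the first coefficient should show up in intercept_ instead, which is why coef_ alone looked wrong in the plot.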

Answer 1 (accepted) 2019-09-07 19:03:29
