简体   繁体   中英

Get the least squares straight line for a set of points

For a set of points, I want to get the straight line that is the closest approximation of the points using a least squares fit.

I can find a lot of overly complex solutions here on SO and elsewhere but I have not been able to find something simple. And this should be very simple.

x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])
slope, intercept = leastSquares(x, y)

Is there some library function that implements the above leastSquares()?

Well for one, I think for an ordinary least squares fit with a single line you can derive a closed-form solution for the coefficients, if I'm not utterly mistaken. Though there's some pitfalls with numerical stability.

If you look for least squares in general, you'll find more general and thus more complex solutions, because least squares can be done for many more models than just the linear one.

But maybe the sklearn package with its LinearRegression model might do easily what you want to do? https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

or for more detailed control the scipy package, https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.lstsq.html

import numpy as np
from scipy.linalg import lstq

# Turn x into 2d array; raise it to powers 0 (for y-axis-intercept)
# and 1 (for the slope part)
M = x[:, np.newaxis] ** [0, 1]

p, res, rnk, s = lstq(M, y)

intercept, slope = p[0], p[1]

Here's one way to implement the least squares regression:

import numpy as np

x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])

def leastSquares(x, y):
    A = np.vstack([x, np.ones(len(x))]).T
    y = y[:, np.newaxis]
    slope, intercept = np.dot((np.dot(np.linalg.inv(np.dot(A.T,A)),A.T)),y)
    return slope, intercept

slope, intercept = leastSquares(x, y)

numpy.linalg.lstsq can compute such a fit for you. There is an example in the documentation that does exactly what you need.

https://numpy.org/doc/stable/reference/generated/numpy.linalg.lstsq.html#numpy-linalg-lstsq

To summarize it here …

>>> x = np.array([0, 1, 2, 3])
>>> y = np.array([-1, 0.2, 0.9, 2.1])
>>> A = np.stack([x, np.ones(len(x))]).T

>>> m, c = np.linalg.lstsq(A, y, rcond=None)[0]
>>> m, c
(1.0 -0.95) # may vary

You can try with Moore-Penrose pseudo-inverse:

from scipy import linalg
x = np. array([1, 2, 3, 4])
y = np. array([23, 31, 42, 43 ])
x = np.array([x, np.ones(len(x))]) 
B = linalg.pinv(x)
sol = np.reshape(y,(1,len(y))) @ B 
slope, intercept = sol[0,0], sol[0,1] 

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM