
Is there a way to plot the ordinary least squares type of line on another plot?

I currently have a scatter plot of data points, and I want to draw a line that captures the general pattern of the data. I believe that this is also known as an ordinary least squares regression method, but I may be wrong as I'm not completely familiar with the literature.

For example, if I had a plot like the following:

[image: example scatter plot]

I just want a line that goes through the data points, that captures the most general trend.

I've tried methods like using Scikit-Learn's LinearRegression module, but I'll have to split my data into train and test sets and perform regression. Is there a way that I can just capture the general trend without having to do this?

Thank you.

Here is an example polynomial fitter that does this. If you convert your date format to a numeric type such as "elapsed days", you can substitute your data directly into the example. I use a second-order polynomial (quadratic), set near the top of the code, because to my eye the trend of your data has some curvature rather than following a straight line. numpy.polyfit performs a least-squares fit on all of the data you give it, so no train/test split is needed; setting polynomialOrder = 1 gives an ordinary least squares straight line.

[image: plot]

import numpy
import matplotlib.pyplot as plt

xData = numpy.array([1.1, 2.2, 3.3, 4.4, 5.0, 6.6, 7.7, 0.0])
yData = numpy.array([1.1, 20.2, 30.3, 40.4, 50.0, 60.6, 70.7, 0.1])

polynomialOrder = 2 # example quadratic

# curve fit the test data
fittedParameters = numpy.polyfit(xData, yData, polynomialOrder)
print('Fitted Parameters:', fittedParameters)

modelPredictions = numpy.polyval(fittedParameters, xData)
absError = modelPredictions - yData

SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print('RMSE:', RMSE)
print('R-squared:', Rsquared)

print()


##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)

    # first the raw data as a scatter plot
    axes.plot(xData, yData,  'D')

    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = numpy.polyval(fittedParameters, xModel)

    # now the model as a line plot
    axes.plot(xModel, yModel)

    axes.set_xlabel('X Data') # X axis data label
    axes.set_ylabel('Y Data') # Y axis data label

    plt.show()
    plt.close('all') # clean up after using pyplot

graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
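For the date conversion mentioned above, here is a minimal sketch of one way to get "elapsed days" and fit a straight line to them. The sample dates and values are made up for illustration, and the names dates, values and elapsedDays are my own, not from the question; it also assumes your dates can be parsed as numpy datetime64 values.

```python
import numpy

# Hypothetical sample data: replace with your own dates and values.
dates = numpy.array(
    ['2021-01-01', '2021-01-08', '2021-01-15', '2021-01-29'],
    dtype='datetime64[D]')
values = numpy.array([10.0, 12.0, 14.0, 18.0])

# Convert dates to a numeric type: days elapsed since the first date.
# Dividing a timedelta64 array by a one-day timedelta yields floats.
elapsedDays = (dates - dates[0]) / numpy.timedelta64(1, 'D')

# Order 1 gives a straight line fitted by least squares.
polynomialOrder = 1
fittedParameters = numpy.polyfit(elapsedDays, values, polynomialOrder)
slope, intercept = fittedParameters

# Evaluate the fitted line at the original x positions.
trend = numpy.polyval(fittedParameters, elapsedDays)

print('Slope per day:', slope)
print('Intercept:', intercept)
```

The resulting elapsedDays and values arrays can be used in place of xData and yData in the example above, so the plotting function works unchanged.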
