I want to fit a linear regression to a scatter plot using polyfit, and I also want the residual so I can see how good the fit is. But I am unsure how to get it, as it doesn't seem possible to get the residual as an output value from polyfit since this is one dimensional. My code:
p = np.polyfit(lengths, breadths, 1)
m = p[0]
b = p[1]
yfit = np.polyval(p,lengths)
newlengths = []
for y in lengths:
    newlengths.append(y*m+b)
ax.plot(lengths, newlengths, '-', color="#2c3e50")
I saw a Stack Overflow answer that used polyval, but I am unsure what that gives me. Is it the fitted values for the lengths? Should I find the error by taking the difference between each element of the polyval output and the corresponding element of breadths?
You can use the keyword full=True when calling polyfit (see http://docs.scipy.org/doc/numpy/reference/generated/numpy.polyfit.html) to get the least-squares error of your fit:
coefs, residual, _, _, _ = np.polyfit(lengths, breadths, 1, full=True)
You can get the same value (the sum of squared residuals) by doing:
coefs = np.polyfit(lengths, breadths, 1)
yfit = np.polyval(coefs,lengths)
residual = np.sum((breadths-yfit)**2)
or, since the residuals of a least-squares fit with an intercept have zero mean:
residual = np.std(breadths-yfit)**2 * len(breadths)
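To check that these three routes agree, here is a minimal sketch. The data is made up (a noisy line standing in for your lengths and breadths), and the names errors, sse_manual and sse_std are only for illustration. Note that with full=True the residual comes back as a length-1 array rather than a plain float.

```python
import numpy as np

# Hypothetical sample data standing in for the question's lengths/breadths.
rng = np.random.default_rng(0)
lengths = np.linspace(1.0, 10.0, 20)
breadths = 0.5 * lengths + 2.0 + rng.normal(scale=0.3, size=lengths.size)

# full=True returns (coefficients, residuals, rank, singular_values, rcond).
# residuals is a length-1 array holding the sum of squared residuals.
coefs, residual, _, _, _ = np.polyfit(lengths, breadths, 1, full=True)

# polyval evaluates the fitted line at each length.
yfit = np.polyval(coefs, lengths)
errors = breadths - yfit  # per-point residuals

# Sum of squared residuals computed by hand.
sse_manual = np.sum(errors ** 2)

# Same quantity via the variance; valid here because the residuals of a
# least-squares fit that includes an intercept average to zero.
sse_std = np.std(errors) ** 2 * len(breadths)

print(residual[0], sse_manual, sse_std)
```

All three printed numbers should match up to floating-point rounding. A smaller value means the points lie closer to the fitted line, but it scales with the number of points and with the units of breadths, so it is mainly useful for comparing fits on the same data.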
Additionally, if you want to plot the residuals, you can do:
coefs = np.polyfit(lengths, breadths, 1)
yfit = np.polyval(coefs,lengths)
ax.plot(lengths, breadths-yfit)  # ax is the matplotlib Axes from the question's code