Non-linear least-squares regression in Python
I have to calculate a non-linear least-squares regression for my ~30 data points, following the formula y = A*x / (1 - x/C).
I tried the curve_fit function from scipy.optimize using the following code:
from scipy.optimize import curve_fit

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

popt, pcov = curve_fit(func, CSV[:, 1], CSV[:, 0])
p1 = popt[0]
p2 = popt[1]
with p1 and p2 being equivalent to A and C, respectively, and CSV being my data array. The function runs without an error message, but the outcome is not as expected. I've plotted the outcome of the function together with the original data points. I was not expecting this nearly straight line (red line in the plot), but something closer to the green line, which is simply a second-order polynomial fit from Excel. The green dashed line shows a quick manual attempt to get closer to the polynomial fit.
[Plot: wrong calculation of the fit function, together with the original data points]
Does anyone have an idea how to make the calculation run as I want it to?
Your code is fine. The data, though, is not easy to fit to. There are too few points on the right side of the chart and too much noise on the left-hand side. This is why curve_fit fails. Some ways to improve the solution could be:
curve_fit() may not be the strongest tool. See if you can get better results with other regression-type tools.
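One lever within curve_fit itself is a starting guess (p0) and parameter bounds: by default the optimizer starts at (1, 1), which can be far from a sensible (A, C) for this model. A sketch with synthetic data standing in for the question's CSV (the true parameters A = 2.2, C = 0.13 and the noise level are made up for illustration):

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

# Synthetic data from known parameters, standing in for the original CSV
rng = np.random.default_rng(0)
x = np.linspace(0.0, 0.11, 30)
y = func(x, 2.2, 0.13) + rng.normal(0, 0.05, x.size)

# p0 steers the optimizer toward the right region; the lower bound on p2
# keeps the pole 1 - x/p2 = 0 outside the data range
popt, pcov = curve_fit(func, x, y, p0=[2.0, 0.15],
                       bounds=([0, 0.111], [np.inf, 1.0]))
perr = np.sqrt(np.diag(pcov))  # one-sigma parameter uncertainties
print(popt, perr)
```

With a reasonable p0 the fit recovers the generating parameters closely; without one, the optimizer can wander off toward a degenerate solution like the near-straight red line in the question.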
Below is the best I could get with your initial data and formula:
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

df = pd.read_csv("c:\\temp\\data.csv", header=None, dtype='float')
df.columns = ('x', 'y')

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

popt, pcov = curve_fit(func, df.x, df.y, maxfev=3000)
print('p1,p2:', popt)
p1, p2 = popt

y_pred = [func(x, p1, p2) for x in range(0, 140, 5)]
plt.scatter(df.x, df.y)
plt.scatter(range(0, 140, 5), y_pred)
plt.show()
p1,p2: [-8.60771432e+02 1.08755430e-05]
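Another option (my addition, not something tried above) is weighting: since the noisy left-hand points dominate an unweighted least-squares fit, per-point uncertainties passed via curve_fit's sigma argument make the optimizer trust the cleaner right-hand points more. A sketch with synthetic data and invented noise levels:

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

# Synthetic data: noisier on the left, as described in the answer
rng = np.random.default_rng(1)
x = np.linspace(0.005, 0.11, 30)
noise = np.where(x < 0.05, 0.3, 0.02)
y = func(x, 2.2, 0.13) + rng.normal(0, noise)

# sigma gives each point's uncertainty, so noisy points count for less
bounds = ([0, 0.111], [np.inf, 1.0])
popt_w, _ = curve_fit(func, x, y, p0=[2.0, 0.15], bounds=bounds,
                      sigma=noise, absolute_sigma=True)
popt_u, _ = curve_fit(func, x, y, p0=[2.0, 0.15], bounds=bounds)
print('weighted:', popt_w, 'unweighted:', popt_u)
```

The weighted fit effectively anchors the curve on the informative right-hand region instead of chasing the noise on the left.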
I think I've figured out the best way to solve this problem, by using the lmfit package ( https://lmfit.github.io/lmfit-py/v ). It worked best when I fitted the non-linear least-squares regression not to the original data but to the fitting function provided by Excel (not very elegant, though).
from lmfit import Model
import matplotlib.pyplot as plt
import numpy as np

def func(x, o1, o2):
    return o1 * x / (1 - x / o2)

# Fit against Excel's exponential trend line instead of the raw data
xt = np.arange(0, 0.12, 0.005)
yt = 2.2268 * np.exp(40.755 * xt)

model = Model(func)
result = model.fit(yt, x=xt, o1=210, o2=0.118)
print(result.fit_report())

plt.plot(xt, yt, 'bo')
plt.plot(xt, result.init_fit, 'k--', label='initial fit')
plt.plot(xt, result.best_fit, 'r-', label='best fit')
plt.legend(loc='best')
plt.show()
The results look pretty nice, and the package is really easy to use (I've left out the final plot):
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 25
# data points = 24
# variables = 2
chi-square = 862.285318
reduced chi-square = 39.1947872
Akaike info crit = 89.9567771
Bayesian info crit = 92.3128848
[[Variables]]
o1: 310.243771 +/- 12.7126811 (4.10%) (init = 210)
o2: 0.13403974 +/- 0.00120453 (0.90%) (init = 0.118)
[[Correlations]] (unreported correlations are < 0.100)
C(o1, o2) = 0.930
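For comparison (my addition, not part of the answer): the same fit can be reproduced without lmfit, since scipy's curve_fit wraps the same Levenberg-Marquardt leastsq routine; with the same starting values it should land near the o1 ≈ 310, o2 ≈ 0.134 in the report above.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, o1, o2):
    return o1 * x / (1 - x / o2)

# Same target as the lmfit example: Excel's exponential fit on a grid
xt = np.arange(0, 0.12, 0.005)
yt = 2.2268 * np.exp(40.755 * xt)

# Same starting values as the lmfit call (o1=210, o2=0.118)
popt, pcov = curve_fit(func, xt, yt, p0=[210, 0.118])
print('o1, o2:', popt)
```

The parameter uncertainties in the fit report correspond to np.sqrt(np.diag(pcov)) here.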