
Non-linear least-square regression in Python

I have to calculate a non-linear least-square regression for my ~30 data points following the formula:

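[Image: non-linear least-square formula. Judging from the code and the parameter names used below, it reads y = A*x / (1 - x/C).]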

I tried the curve_fit function from scipy.optimize using the following code:

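from scipy.optimize import curve_fit

def func(x, p1, p2):
  return p1*x/(1-x/p2)

# CSV is the data array: column 1 is passed as x, column 0 as y
popt, pcov = curve_fit(func, CSV[:,1], CSV[:,0])

p1 = popt[0]
p2 = popt[1]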

with p1 and p2 being equivalent to A and C, respectively, and CSV being my data array. The function runs without an error message, but the outcome is not as expected. I've plotted the outcome of the function together with the original data points. I was not looking to get this nearly straight line (red line in the plot), but something closer to the green line, which is simply a second-order polynomial fit from Excel. The green dashed line shows just a quick manual attempt to get closer to the polynomial fit.

Wrong calculation of the fit function, together with the original data points: 1

Does anyone have an idea how to make the calculation run as I want it to?

Your code is fine. The data, though, is not easy to fit. There are too few points on the right side of the chart and too much noise on the left-hand side. This is why curve_fit fails. Some ways to improve the solution could be:

  • raising the maxfev parameter of curve_fit() (see the curve_fit() documentation)
  • giving starting values to curve_fit() (see the same place)
  • adding more data points
  • using more parameters in the function, or a different function.
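The first two suggestions can be sketched as follows. This is only an illustration on synthetic data: the "true" values 2 and 200, the noise level, and the starting guess are made up, and only the p0 and maxfev arguments of curve_fit() are the point here.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

# Synthetic data (made up for illustration): true p1 = 2, true p2 = 200
rng = np.random.default_rng(0)
x = np.linspace(5, 140, 30)
y = func(x, 2.0, 200.0) + rng.normal(scale=2.0, size=x.size)

# p0 supplies the starting values; maxfev raises the function-evaluation limit
popt, pcov = curve_fit(func, x, y, p0=(1.5, 250.0), maxfev=5000)
print('p1,p2:', popt)
```

A starting value for p2 that lies beyond the largest x keeps the pole of the model outside the data range at the start of the search, which tends to help with this kind of function.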

curve_fit() may not be the strongest tool. See if you can get better results with other regression-type tools.
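One possible alternative, offered only as a sketch, is scipy.optimize.least_squares with a robust loss, which down-weights noisy points. The data below are synthetic (true values 2 and 200 plus two artificial outliers), and the f_scale value and the bounds are assumptions chosen for this toy example, not recommendations for the original data.

```python
import numpy as np
from scipy.optimize import least_squares

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

def residuals(params, x, y):
    p1, p2 = params
    return func(x, p1, p2) - y

# Synthetic data (made up): true p1 = 2, p2 = 200, plus two gross outliers
rng = np.random.default_rng(1)
x = np.linspace(5, 140, 30)
y = func(x, 2.0, 200.0) + rng.normal(scale=2.0, size=x.size)
y[3] += 200.0
y[10] -= 150.0

# The lower bound on p2 keeps the pole at x = p2 to the right of all data points
result = least_squares(
    residuals, x0=[1.5, 250.0], args=(x, y),
    loss='soft_l1', f_scale=5.0,
    bounds=([0.0, 141.0], [np.inf, np.inf]),
    max_nfev=2000,
)
p1_fit, p2_fit = result.x
print('p1,p2:', result.x)
```

With loss='soft_l1', residuals much larger than f_scale contribute roughly linearly instead of quadratically, so a few bad points pull the fit less than they would in a plain least-squares fit.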

Below is the best I could get with your initial data and formula:

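import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

df = pd.read_csv("c:\\temp\\data.csv", header=None, dtype='float')
df.columns = ('x', 'y')

def func(x, p1, p2):
    return p1*x/(1-x/p2)

# allow more function evaluations than the default
popt, pcov = curve_fit(func, df.x, df.y, maxfev=3000)
print('p1,p2:', popt)
p1, p2 = popt

# evaluate the fitted model on a regular grid of x values
x_grid = range(0, 140, 5)
y_pred = [func(x, p1, p2) for x in x_grid]
plt.scatter(df.x, df.y)
plt.scatter(x_grid, y_pred)

plt.show()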

p1,p2: [-8.60771432e+02 1.08755430e-05]
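A quick way to sanity-check such a result is to evaluate the model with the printed parameters. The sketch below only plugs the two numbers above into the formula; the x grid is an assumption that mirrors the range used for plotting. Because p2 is tiny compared with x, the denominator is dominated by -x/p2 and the model is nearly the constant -p1*p2 for every x on the grid.

```python
import numpy as np

def func(x, p1, p2):
    return p1 * x / (1 - x / p2)

# Parameters as printed above
p1, p2 = -8.60771432e+02, 1.08755430e-05

x = np.arange(5, 140, 5)
y = func(x, p1, p2)

# For x much larger than p2 the model is nearly the constant -p1*p2
print(y.min(), y.max(), -p1 * p2)
```

If the minimum and maximum are practically identical, the optimizer has ended up in a region where the curve is flat over the data range, which is a hint to try different starting values.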


I think I've figured out the best way to solve this problem by using the lmfit package ( https://lmfit.github.io/lmfit-py/v ). It worked best when I tried to fit the non-linear least-square regression not to the original data but to the fitting function provided by Excel (not very elegant, though).

from lmfit import Model
import matplotlib.pyplot as plt
import numpy as np

def func(x, o1, o2):
    return o1*x/(1-x/o2)

# sample the fitting function provided by Excel instead of using the raw data
xt = np.arange(0, 0.12, 0.005)
yt = 2.2268*np.exp(40.755*xt)

# o1 and o2 are passed as initial guesses for the fit
model = Model(func)
result = model.fit(yt, x=xt, o1=210, o2=0.118)

print(result.fit_report())

plt.plot(xt, yt, 'bo')
plt.plot(xt, result.init_fit, 'k--', label='initial fit')
plt.plot(xt, result.best_fit, 'r-', label='best fit')
plt.legend(loc='best')
plt.show()

The results look pretty nice and the package is really easy to use (I've left out the final plot):

[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 25
    # data points      = 24
    # variables        = 2
    chi-square         = 862.285318
    reduced chi-square = 39.1947872
    Akaike info crit   = 89.9567771
    Bayesian info crit = 92.3128848
[[Variables]]
    o1:  310.243771 +/- 12.7126811 (4.10%) (init = 210)
    o2:  0.13403974 +/- 0.00120453 (0.90%) (init = 0.118)
[[Correlations]] (unreported correlations are < 0.100)
    C(o1, o2) =  0.930
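For comparison, the same kind of uncertainty and correlation figures can be derived from the covariance matrix that curve_fit() returns. This is a sketch under the assumption that the same target curve and starting values are used as in the lmfit example; the numbers it prints are not guaranteed to match the report above digit for digit.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, o1, o2):
    return o1 * x / (1 - x / o2)

# Same target curve and starting values as in the lmfit example above
xt = np.arange(0, 0.12, 0.005)
yt = 2.2268 * np.exp(40.755 * xt)

popt, pcov = curve_fit(func, xt, yt, p0=(210, 0.118))

perr = np.sqrt(np.diag(pcov))              # one-sigma standard errors
corr = pcov[0, 1] / (perr[0] * perr[1])    # correlation between o1 and o2
print('o1 = %.4g +/- %.2g' % (popt[0], perr[0]))
print('o2 = %.4g +/- %.2g' % (popt[1], perr[1]))
print('C(o1, o2) = %.3f' % corr)
```

A correlation close to 1, as in the report above, means the two parameters can largely compensate for each other, which is one reason the fit is sensitive to its starting values.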
