
How to fit multiple exponential curves using python

For a single exponential curve, such as the one shown in the image here (curve_fit for a single exponential curve), I am able to fit the data using scipy.optimize.curve_fit. However, I am unsure how to fit a similar dataset composed of multiple exponential curves, as shown here (double exponential curves). I achieved the fit for the single curve using the following approach:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def exp_decay(x, a, r):
    return a * ((1 - r)**x)

x = np.linspace(0, 50, 50)
y = exp_decay(x, 400, 0.06)

y1 = exp_decay(x, 550, 0.06)      # this is to be appended to y to generate two curves

pars, cov = curve_fit(exp_decay, x, y, p0=[0, 0])
plt.scatter(x, y)
plt.plot(x, exp_decay(x, *pars), 'r-')     # this realizes the fit for a single curve

yx = np.append(y, y1)   # this realizes two exponential curves (as shown above - double exponential curves) that I now need to fit a model to

Can someone help describe how to achieve this for a dataset of two curves? My actual dataset comprises multiple exponential curves, but I think if I can realize a fit for two curves, I may be able to replicate the same for my dataset. This does not have to be done with scipy's curve_fit; any implementation that works is fine.

PLEASE HELP!!!

Your problem can easily be tackled by splitting your dataset using a simple criterion, such as a first derivative estimate, and then applying a simple curve fitting procedure to each sub-dataset.

Trial Dataset

First, let's import some packages and create a synthetic dataset with three curves to represent your problem.

We use a two-parameter exponential model, as the time origin shift will be handled by the splitting methodology. We also add noise, as there is always noise on real-world data:

import numpy as np
import pandas as pd
from scipy import optimize
import matplotlib.pyplot as plt

def func(x, a, b):
    return a*np.exp(b*x)

N = 1001
n1 = N//3
n2 = 2*n1

t = np.linspace(0, 10, N)

x0 = func(t[:n1], 1, -0.2)
x1 = func(t[n1:n2]-t[n1], 5, -0.4)
x2 = func(t[n2:]-t[n2], 2, -1.2)

x = np.hstack([x0, x1, x2])
xr = x + 0.025*np.random.randn(x.size)

Graphically, it renders as follows:

[image]

Dataset Splitting

We can split the dataset into three sub-datasets using a simple criterion: an estimate of the first derivative, computed from the first difference. The goal is to detect where the curve drastically goes up or down (where the dataset should be split). The first derivative is estimated as follows:

dxrdt = np.abs(np.diff(xr)/np.diff(t))

The criterion requires an extra parameter (a threshold) that must be tuned according to your signal specifications. The criterion is equivalent to:

xcrit = 20
q = np.where(dxrdt > xcrit) # (array([332, 665], dtype=int64),)

And the split indices are:

idx = [0] + list(q[0]+1) + [t.size] # [0, 333, 666, 1001]

The criterion threshold is mainly affected by the nature and power of the noise in your data and by the gap magnitudes between two curves. The usability of this methodology depends on the ability to detect the gaps between curves in the presence of noise. It will break when the noise power has the same magnitude as the gap we want to detect. You may also observe false split indices if the noise is heavy-tailed (a few strong outliers).

In this MCVE, we have set the threshold to 20 [Signal Units/Time Units]:

[image]
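If you prefer not to pick the threshold by hand, one possible approach is to derive it from a robust estimate of the derivative noise. The sketch below is not part of the original method: the helper name noise_threshold, the factor k=6 and the use of the median absolute deviation (MAD) are all assumptions for illustration, and the scaling by 1.4826 assumes roughly Gaussian noise.

```python
import numpy as np

def noise_threshold(x, t, k=6.0):
    """Hypothetical helper: k robust standard deviations of the derivative noise."""
    d = np.diff(x) / np.diff(t)
    # 1.4826 * MAD estimates the standard deviation for Gaussian noise
    mad = np.median(np.abs(d - np.median(d)))
    return k * 1.4826 * mad

# Small synthetic signal: two decays with a single jump at index 200
rng = np.random.default_rng(0)
t = np.linspace(0, 4, 400)
x = np.where(t < t[200], np.exp(-0.2 * t), 5 * np.exp(-0.4 * (t - t[200])))
xr = x + 0.025 * rng.standard_normal(x.size)

dxrdt = np.abs(np.diff(xr) / np.diff(t))
xcrit = noise_threshold(xr, t)
splits = np.where(dxrdt > xcrit)[0] + 1   # index of the first sample of each new curve
print(xcrit, splits)
```

Because the MAD ignores a handful of large values, the jumps themselves barely affect the estimate. The limitation described above still applies: if a gap is comparable to the noise, no threshold will separate them.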

An alternative to this hand-crafted criterion is to delegate the identification to the excellent find_peaks function of scipy. However, it does not remove the need to tune the detection to your signal specifications.
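As a rough sketch of that alternative, scipy.signal.find_peaks can be applied to the absolute derivative estimate. The dataset below mirrors the trial dataset; the fixed seed and the choice of height=20 (reusing the threshold above) are assumptions made for this illustration, and other find_peaks parameters such as prominence or distance may suit your signal better.

```python
import numpy as np
from scipy.signal import find_peaks

# Rebuild a dataset similar to the trial dataset (seed chosen arbitrarily)
rng = np.random.default_rng(1)
N = 1001
n1 = N // 3
n2 = 2 * n1
t = np.linspace(0, 10, N)
x = np.hstack([
    1 * np.exp(-0.2 * t[:n1]),
    5 * np.exp(-0.4 * (t[n1:n2] - t[n1])),
    2 * np.exp(-1.2 * (t[n2:] - t[n2])),
])
xr = x + 0.025 * rng.standard_normal(x.size)

# Peaks of the absolute first-difference derivative mark the jumps
dxrdt = np.abs(np.diff(xr) / np.diff(t))
peaks, props = find_peaks(dxrdt, height=20)

# Same split index convention as above: each segment is xr[i:j]
idx = [0] + [int(p) + 1 for p in peaks] + [t.size]
print(idx)
```

The resulting idx list can be passed to the fitting loop unchanged.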

Fit origin-shifted dataset

Now we can apply the curve fitting to each sub-dataset (with origin-shifted time), collect parameters and statistics, and plot the result:

trials = []
fig, axe = plt.subplots()
for k, (i, j) in enumerate(zip(idx[:-1], idx[1:])):
    p, s = optimize.curve_fit(func, t[i:j]-t[i], xr[i:j])
    axe.plot(t[i:j], xr[i:j], '.', label="Data #{}".format(k+1))
    axe.plot(t[i:j], func(t[i:j]-t[i], *p), label="Data Fit #{}".format(k+1))
    trials.append({"n0": i, "n1": j, "t0": t[i], "a": p[0], "b": p[1],
                   "s_a": s[0,0], "s_b": s[1,1], "s_ab": s[0,1]})
axe.set_title("Curve Fits")
axe.set_xlabel("Time, $t$")
axe.set_ylabel(r"Signal Estimate, $\hat{g}(t)$")
axe.legend()
axe.grid()
df = pd.DataFrame(trials)

It returns the following fitting results:

    n0    n1    t0         a         b       s_a           s_b      s_ab
0    0   333  0.00  0.998032 -0.199102  0.000011  4.199937e-06 -0.000005
1  333   666  3.33  5.001710 -0.399537  0.000013  3.072542e-07 -0.000002
2  666  1001  6.66  2.002495 -1.203943  0.000030  2.256274e-05 -0.000018

This agrees with our original parameters (see the Trial Dataset section).

Graphically, we can check the goodness of fit:
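Note that s_a, s_b and s_ab in the table are entries of the covariance matrix returned by curve_fit, that is, variances and a covariance. A minimal sketch of turning them into standard errors and a correlation coefficient is shown below; the single-segment data, the seed and the initial guess p0 are assumptions made for this example only.

```python
import numpy as np
from scipy import optimize

def func(x, a, b):
    return a * np.exp(b * x)

# One noisy segment resembling the second curve of the trial dataset
rng = np.random.default_rng(2)
t = np.linspace(0, 3.33, 334)
xr = func(t, 5, -0.4) + 0.025 * rng.standard_normal(t.size)

# p0 is an assumed starting point for a decaying curve
p, s = optimize.curve_fit(func, t, xr, p0=(1.0, -1.0))

perr = np.sqrt(np.diag(s))              # one-sigma standard errors of a and b
corr = s[0, 1] / (perr[0] * perr[1])    # correlation between a and b

for name, val, err in zip("ab", p, perr):
    print(f"{name} = {val:.4f} +/- {err:.4f}")
print(f"corr(a, b) = {corr:.2f}")
```

Comparing each estimate with its true value in units of the standard error is a quick way to judge whether a fit is consistent with the parameters used to generate the data.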

[image]
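The model used here, a*exp(b*x), is the same curve as the a*(1-r)**x form in the question, with b = ln(1-r). A short sketch of converting between the two parameterizations follows; the helper names rate_from_b and b_from_rate are hypothetical and introduced only for illustration.

```python
import numpy as np

def rate_from_b(b):
    """Decay rate r of a*(1-r)**x equivalent to a*exp(b*x)."""
    return 1.0 - np.exp(b)

def b_from_rate(r):
    """Exponent b of a*exp(b*x) equivalent to a*(1-r)**x."""
    return np.log1p(-r)   # log(1 - r), accurate for small r

# Values taken from the question's single-curve example
x = np.linspace(0, 50, 50)
a, r = 400, 0.06
b = b_from_rate(r)

# Both forms describe the same curve
same = np.allclose(a * (1 - r) ** x, a * np.exp(b * x))
print(b, rate_from_b(b), same)
```

With this mapping, the fitted b of each segment can be reported as a rate r if that form is more convenient for your dataset.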
