Multiple peaks curve fitting using lmfit library in python
I have a data file where the first column is the x-value, the second column is the y-value, and the third column is the y-error. I would like to fit the data. I am following the example from here and my code is:
import matplotlib.pyplot as plt
import numpy as np
from lmfit.models import ExponentialModel, GaussianModel
file='sample-data.txt'
dat = np.loadtxt(file)
x = dat[:, 0]
y = dat[:, 1]
exp_mod = ExponentialModel(prefix='exp_')
pars = exp_mod.guess(y, x=x)
gauss1 = GaussianModel(prefix='g1_')
pars.update(gauss1.make_params())
pars['g1_center'].set(value=105000, min=75000, max=125000)
pars['g1_sigma'].set(value=150000, min=30000)
pars['g1_amplitude'].set(value=2000000, min=100000)
gauss2 = GaussianModel(prefix='g2_')
pars.update(gauss2.make_params())
pars['g2_center'].set(value=155000, min=125000, max=175000)
pars['g2_sigma'].set(value=150000, min=30000)
pars['g2_amplitude'].set(value=2000000, min=100000)
mod = gauss1 + gauss2 + exp_mod
init = mod.eval(pars, x=x)
out = mod.fit(y, pars, x=x)
print(out.fit_report(min_correl=0.5))
fig, axes = plt.subplots(1, 2, figsize=(12.8, 4.8))
axes[0].plot(x, y, 'b')
axes[0].plot(x, init, 'k--', label='initial fit')
axes[0].plot(x, out.best_fit, 'r-', label='best fit')
axes[0].legend(loc='best')
comps = out.eval_components(x=x)
axes[1].plot(x, y, 'b')
axes[1].plot(x, comps['g1_'], 'g--', label='Gaussian component 1')
axes[1].plot(x, comps['g2_'], 'm--', label='Gaussian component 2')
axes[1].plot(x, comps['exp_'], 'k--', label='Exponential component')
axes[1].legend(loc='best')
plt.show()
This code gives me the following plot (the fit is not working):
I am expecting something like this:
Update
I tried using find_peaks, as suggested by @mikuszefski in the comments, but it also picks up all the small peaks (noise), as the image shows.
Is there a way to select only the larger peaks?
You really have to ("must", "are required to", "definitely under all circumstances") provide reasonable, finite, and plausible initial values for all the parameters.
When you say
pars['g1_center'].set(value=105000, min=75000, max=125000)
pars['g1_sigma'].set(value=150000, min=30000)
pars['g1_amplitude'].set(value=2000000, min=100000)
gauss2 = GaussianModel(prefix='g2_')
pars.update(gauss2.make_params())
pars['g2_center'].set(value=155000, min=125000, max=175000)
pars['g2_sigma'].set(value=150000, min=30000)
pars['g2_amplitude'].set(value=2000000, min=100000)
You are (literally "literally") telling the program that Gaussian #1 should start with a center value of 105000 and cannot under any circumstances go outside [75000, 125000].
The data you provided and the plot show that the two peaks you are interested in occur at x values of around 1.4 and 1.5.
So the value for the center is around 1, and you told the program that the value was around 10^5 and could not go below 75,000. What you got is the best fit under those constraints. The program worked without error or problem, and you got exactly what you asked for.
Again, for non-linear least-squares problems and curve-fitting, initial values always matter. There are no situations in which they do not matter.
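As a quick sanity check before fitting, it can help to confirm that each initial center actually falls inside the range of the x data. The helper below is a hypothetical illustration (the name guess_in_data_range and the x-range of 1.0 to 2.0 are assumptions, not taken from the question's data file):

```python
import numpy as np


def guess_in_data_range(x, guess):
    """Return True if an initial center guess lies inside the sampled x-range."""
    x = np.asarray(x, dtype=float)
    return bool(x.min() <= guess <= x.max())


# Assumed x-range for illustration; the real file's range may differ.
x = np.linspace(1.0, 2.0, 501)

print(guess_in_data_range(x, 105000))  # center guess used in the question -> False
print(guess_in_data_range(x, 1.4))     # center guess read off the plot -> True
```

A guess that fails this check is on the wrong scale, and bounds built around it will keep the fit away from the data entirely.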
That said, using a peak-finding algorithm, as mikuszefski suggests, is a fine choice.
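To keep only the larger peaks, one option is the prominence argument of scipy.signal.find_peaks. The sketch below uses synthetic data as a stand-in for the real file; the peak positions, noise level, and the prominence threshold of 2.0 are all assumed values that would need tuning for the actual data:

```python
import numpy as np
from scipy.signal import find_peaks

# Synthetic stand-in for the real data (assumed shape: two peaks near
# x = 1.4 and x = 1.5 on a decaying background, plus noise).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 2.0, 1001)
y = (
    10.0 * np.exp(-0.5 * ((x - 1.4) / 0.015) ** 2)
    + 7.0 * np.exp(-0.5 * ((x - 1.5) / 0.015) ** 2)
    + 5.0 * np.exp(-x / 0.5)
    + rng.normal(scale=0.1, size=x.size)
)

# With no criteria, every noisy local maximum counts as a peak.
all_peaks, _ = find_peaks(y)

# Requiring a minimum prominence keeps only peaks that stand well above
# their surroundings. The threshold of 2.0 is an assumed value, chosen to
# be far above the noise level (0.1) and below the peak heights.
big_peaks, props = find_peaks(y, prominence=2.0)

print("all local maxima:", all_peaks.size)
print("prominent peaks at x =", x[big_peaks])
```

The x positions of the prominent peaks can then serve as initial values for the Gaussian centers. The height and distance arguments of find_peaks are alternative filters if prominence does not separate signal from noise well enough.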
Aside: Bounds should be used primarily to enforce constraints that come from the logic or physics of the problem. It might be reasonable to say the amplitude should be positive, for example. There is nothing intrinsic about Gaussians in general (or, probably, your data) that makes a centroid value of 74999 nonsensical. So do not start with such bounds. Start without bounds and as simply as possible. Add such complexity only when it is needed.