
Multiple peaks curve fitting using lmfit library in python

I have a data file where the first column is the x value, the second column is the y value, and the third column is the y error. I would like to fit the data. I am following the example from here, and my code is:

import matplotlib.pyplot as plt
import numpy as np

from lmfit.models import ExponentialModel, GaussianModel


file='sample-data.txt'
dat = np.loadtxt(file)
x = dat[:, 0]
y = dat[:, 1]



exp_mod = ExponentialModel(prefix='exp_')
pars = exp_mod.guess(y, x=x)

gauss1 = GaussianModel(prefix='g1_')
pars.update(gauss1.make_params())

pars['g1_center'].set(value=105000, min=75000, max=125000)
pars['g1_sigma'].set(value=150000, min=30000)
pars['g1_amplitude'].set(value=2000000, min=100000)

gauss2 = GaussianModel(prefix='g2_')
pars.update(gauss2.make_params())

pars['g2_center'].set(value=155000, min=125000, max=175000)
pars['g2_sigma'].set(value=150000, min=30000)
pars['g2_amplitude'].set(value=2000000, min=100000)

mod = gauss1 + gauss2 + exp_mod

init = mod.eval(pars, x=x)
out = mod.fit(y, pars, x=x)

print(out.fit_report(min_correl=0.5))

fig, axes = plt.subplots(1, 2, figsize=(12.8, 4.8))
axes[0].plot(x, y, 'b')
axes[0].plot(x, init, 'k--', label='initial fit')
axes[0].plot(x, out.best_fit, 'r-', label='best fit')
axes[0].legend(loc='best')

comps = out.eval_components(x=x)
axes[1].plot(x, y, 'b')
axes[1].plot(x, comps['g1_'], 'g--', label='Gaussian component 1')
axes[1].plot(x, comps['g2_'], 'm--', label='Gaussian component 2')
axes[1].plot(x, comps['exp_'], 'k--', label='Exponential component')
axes[1].legend(loc='best')

plt.show()

This code gives me the following plot (the fit is not working):

[image]

I am expecting something like this:

[image]

  1. Can anyone help me fit the data in the plot?
  2. In the example, the value, min, and max for center, sigma, and amplitude were defined manually. Is there any way to get or calculate those values from the data file?

Update

I tried using find_peaks, as suggested by @mikuszefski in the comments. But it also picks up all the small peaks (noise), as the image shows.

[image]

Is there a way to choose the values only for the larger peaks?

You really have to ("must", "are required to", "definitely under all circumstances") provide reasonable, finite, and plausible initial values for all the parameters.

When you say

pars['g1_center'].set(value=105000, min=75000, max=125000)
pars['g1_sigma'].set(value=150000, min=30000)
pars['g1_amplitude'].set(value=2000000, min=100000)

gauss2 = GaussianModel(prefix='g2_')
pars.update(gauss2.make_params())

pars['g2_center'].set(value=155000, min=125000, max=175000)
pars['g2_sigma'].set(value=150000, min=30000)
pars['g2_amplitude'].set(value=2000000, min=100000)

You are (literally "literally") telling the program that Gaussian #1 should start with a center value of 105000, and cannot under any circumstance go outside [75000, 125000].

The data you provided and the plot show that the two peaks you are interested in occur at x values of around 1.4 and 1.5.

So, the value for the center is around 1, but you told the program the value was around 10^5 and could not go below 75,000. What you got is the best fit under those constraints. The program worked without error or problem, and you got exactly what you asked for.

Again, for non-linear least-squares problems and curve-fitting, initial values always matter. There are no situations in which they do not matter.

That said, using a peak-finding algorithm as mikuszefski suggests is a fine choice.
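One way to keep only the larger peaks is the `prominence` argument of `scipy.signal.find_peaks`. The sketch below again runs on synthetic data (an assumption, as is the threshold of 20% of the data range) and converts the detected peaks into rough starting values for center, sigma, and amplitude. These are starting guesses for a fit, not fit results.

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths

# Synthetic stand-in for the data file (assumed shape, not the real data).
rng = np.random.default_rng(0)
x = np.linspace(1.0, 2.0, 400)
y = (5.0 * np.exp(-x / 0.8)
     + 1.0 * np.exp(-0.5 * ((x - 1.40) / 0.02) ** 2)
     + 0.6 * np.exp(-0.5 * ((x - 1.50) / 0.02) ** 2)
     + rng.normal(0.0, 0.01, x.size))

# Keep only peaks that stand out from their surroundings by more than
# 20% of the data range (assumed threshold; tune it for real data).
threshold = 0.2 * (y.max() - y.min())
peaks, props = find_peaks(y, prominence=threshold)

# Width at half prominence, in samples, converted to x units below.
widths = peak_widths(y, peaks, rel_height=0.5)[0]
dx = x[1] - x[0]

centers = x[peaks]
# For a Gaussian, FWHM = 2*sqrt(2*ln 2)*sigma, roughly 2.3548*sigma.
sigmas = widths * dx / 2.3548
# lmfit's Gaussian amplitude is the area: height * sigma * sqrt(2*pi).
amplitudes = props['prominences'] * sigmas * np.sqrt(2 * np.pi)

for c, s, a in zip(centers, sigmas, amplitudes):
    print(f"center={c:.3f}  sigma={s:.4f}  amplitude={a:.4f}")
```

Each `(center, sigma, amplitude)` triple can then be passed as `value=` to the corresponding `pars[...].set(...)` call.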

Aside: Bounds should be used primarily to constrain the logic/physics. It might be reasonable to say the amplitude should be positive, for example. There is nothing intrinsic about Gaussians in general (or, probably, your data) that says a centroid value of 74999 is nonsensical. So, do not start with such bounds. Start without bounds and as simply as possible. Add such complexity only when it is needed.
