
How do I improve a Gaussian/Normal fit in Python 3.X by using a running median?

I have an array of 100x100 data points, and I'm trying to perform a Gaussian fit to each column of 100 values. I then want the Gaussian parameters found by the fit of one column to be the initial parameters for the fit of the next column. Let's say I start with initial parameters of 1000, 0, and 1, and the fit finds values of 800, 3, and 1.5. I then want the fitter to use these three values as the initial values for the next column.

My code is:

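import numpy as np
# assumed: the names below match the astropy.modeling package
from astropy.modeling import models, fitting

x = np.linspace(-50, 50, 100)
Gauss_Model = models.Gaussian1D(amplitude=1000., mean=0, stddev=1.)
Fitting_Model = fitting.LevMarLSQFitter()

Fit_Data = []

# Data_Array is the 100x100 array; each column is fitted separately
for i in range(Data_Array.shape[1]):
    Fit_Data.append(Fitting_Model(Gauss_Model, x, Data_Array[:, i]))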

Right now it uses the same initial values for every fit. Does anyone know how to perform such a running median/mean for a Gaussian fitting method? I would really appreciate any help or a pointer in the right direction, thanks!

I'm not familiar with the specific library you are using, but if you can get your fitted parameters out with something like fit_data[-1].amplitude or fit_data[-1].mean, then you could modify your loop to use something like:

fit_data = []

for i in range(Data_Array.shape[1]):
    if fit_data:  # true once at least one column has been fitted
        # seed the next fit with the most recent best-fit parameters
        Gauss_Model = models.Gaussian1D(amplitude=fit_data[-1].amplitude,
                                        mean=fit_data[-1].mean,
                                        stddev=fit_data[-1].stddev)
    fit_data.append(Fitting_Model(Gauss_Model, x, Data_Array[:, i]))

Basically, this checks whether you have already fitted a model and, if you have, uses the most recent fitted amplitude, mean, and standard deviation as the starting point for your next Gauss_Model.
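If you want the running median from the question rather than only the most recent fit, one option is to seed each fit with the median of the last few best-fit parameter sets. The following is a minimal sketch, not the asker's or any library's method: it swaps in scipy.optimize.curve_fit for the fitter used above, and the gaussian and fit_columns helpers, the window size of 5, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def gaussian(x, amplitude, mean, stddev):
    """Gaussian profile using the same parameter names as the question."""
    return amplitude * np.exp(-0.5 * ((x - mean) / stddev) ** 2)


def fit_columns(x, data, first_guess, window=5):
    """Fit every column, seeding each fit with the running median of the
    previous `window` best-fit parameter sets (hypothetical helper)."""
    history = []
    guess = np.asarray(first_guess, dtype=float)
    for i in range(data.shape[1]):
        best, _ = curve_fit(gaussian, x, data[:, i], p0=guess)
        best[2] = abs(best[2])  # the model does not fix the sign of stddev
        history.append(best)
        # median of the most recent fits becomes the next initial guess
        guess = np.median(history[-window:], axis=0)
    return np.array(history)


# Synthetic stand-in for Data_Array: a peak that drifts slowly across columns.
rng = np.random.default_rng(0)
x = np.linspace(-50, 50, 100)
true_means = np.linspace(-2, 2, 100)
data = np.column_stack([gaussian(x, 800.0, m, 3.0) for m in true_means])
data += rng.normal(scale=5.0, size=data.shape)

# One row per column: amplitude, mean, stddev
params = fit_columns(x, data, first_guess=(1000.0, 0.0, 1.0))
```

A median over a short window keeps one badly converged column from derailing the initial guess for the columns that follow, which a plain carry-over of the last fit cannot do.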

A thought: this might speed up your fitting, but it shouldn't result in a "better" fit to the 100 data points in each fit operation. Your resulting model is probably already the best-fit model for the data it was given. If you want to estimate the error in the parameters of your model, you can use the fact that, for two independent normal distributions A ~ N(m_a, v_a) and B ~ N(m_b, v_b), the sum A + B has mean m_a + m_b and variance v_a + v_b. Dividing the sum of n such variables by n divides the variance by n**2, so the average of your n fitted means is distributed as N(sum(means)/n, sum(variances)/n**2). In other words, your true mean is centered at the mean of your means, with standard deviation sqrt(sum(variances))/n, which reduces to stddev/sqrt(n) when every fit has the same stddev.
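As a small worked example of combining the fitted means: for n independent estimates, the standard deviation of their average is sqrt(sum(variances))/n, which equals stddev/sqrt(n) when all variances are the same. The helper name mean_of_means and the numbers below are hypothetical, chosen only to illustrate the arithmetic.

```python
import numpy as np


def mean_of_means(means, variances):
    """Average n independent estimates and return that average together
    with its standard deviation, sqrt(sum(variances)) / n."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    n = means.size
    return means.mean(), np.sqrt(variances.sum()) / n


# Hypothetical input: four fitted means, each with variance 0.04 (stddev 0.2).
center, spread = mean_of_means([2.9, 3.1, 3.0, 3.2], [0.04, 0.04, 0.04, 0.04])
print(center, spread)
```

With equal variances the spread here works out to 0.2/sqrt(4), so averaging four columns halves the uncertainty of a single one.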

I also cannot tell what library you are using, and the details of how to do this probably depend on how that library stores the fitted values. I can say that for lmfit ( https://lmfit.github.io/lmfit-py/ ) we struggled with this sort of usage and arrived at a design that makes what you are trying to do pretty easy. With lmfit, you might compose this problem as:

import numpy as np
from lmfit.models import GaussianModel

x = np.linspace(-50,50,100)
# get Data_Array from somewhere....

# create a model for a Gaussian
Gauss_Model = GaussianModel()

# make a set of parameters, setting initial values
params = Gauss_Model.make_params(amplitude=1000, center=0, sigma=1.0)

Fit_Results = []

for i in range(Data_Array.shape[1]):
    result = Gauss_Model.fit(Data_Array[:, i], params, x=x)
    Fit_Results.append(result)
    # update `params` with the current best fit params for the next column
    params = result.params

Note that this works because lmfit is careful that Model.fit() will not alter the input parameters, and will put the resulting best-fit parameters for each fit in result.params.

And, if you decide you do want all columns to use the original initial values, just comment out that last line, params = result.params.

Lmfit has a lot more bells and whistles, but I hope that helps you do what you need.
