[英]How can I find the right gaussian curve given some data?
I have code that draws from a gaussian in 1D: 我有从一维高斯绘制的代码:
import numpy as np
from scipy.stats import norm
from scipy.optimize import curve_fit
import matplotlib.mlab as mlab
import matplotlib.pyplot as plt
import gauss
# Beginning in one dimension:
mean = 0; Var = 1; N = 1000
scatter = np.random.normal(mean,np.sqrt(Var),N)
scatter = np.sort(scatter)
mu,sigma = norm.fit(scatter)
I obtain mu and sigma using norm.fit() 我使用norm.fit()获得mu和sigma
Now I'd like to obtain my parameters using 现在我想使用
xdata = np.linspace(-5,5,N)
pop, pcov = curve_fit(gauss.gauss_1d,xdata,scatter)
The problem is I don't know how to map my scattered points (drawn from a 1D gaussian) to the x-line in order to use curve_fit. 问题是我不知道如何将散点(从1D高斯绘制)映射到X线,以便使用curve_fit。
Also, suppose I simply use and mu and sigma as earlier. 另外,假设我像之前一样简单地使用mu和sigma。
I plot using: 我使用以下方式绘图
n, bins, patches = plt.hist(scatter,50,facecolor='green')
y = 2*max(n)*mlab.normpdf(bins,mu,sigma)
l = plt.plot(bins,y,'r--')
plt.xlabel('x-coord')
plt.ylabel('Occurrences')
plt.grid(True)
plt.show()
But I have to guess the amplitude as 2*max(n). 但是我不得不猜测振幅为2 * max(n)。 It works but it's not robust.
它可以工作,但不可靠。 How can I find the amplitude without guessing?
我如何才能找到振幅而无需猜测?
To avoid guessing the amplitude, call hist()
with normed=True
, then the amplitude corresponds to normpdf()
. 为了避免猜测振幅,请使用
normpdf()
normed=True
调用hist()
,然后振幅对应于normpdf()
。
For doing a curve fit, I suggest to use not the density but the cumulative distribution: Each sample has a height of 1/N
, which successively sum up to 1. This has the advantage that you don't need to group samples in bins. 为了进行曲线拟合,我建议不要使用密度,而应使用累积分布:每个样本的高度为
1/N
,该高度依次相加为1。这具有不需要将样本分组到bin中的优点。 。
import numpy as np
from scipy.stats import norm
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
# Beginning in one dimension:
mean = 0; Var = 1; N = 100
scatter = np.random.normal(mean,np.sqrt(Var),N)
scatter = np.sort(scatter)
mu1,sigma1 = norm.fit(scatter) # classical fit
scat_sum = np.cumsum(np.ones(scatter.shape))/N # cumulative samples
[mu2,sigma2],Cx = curve_fit(norm.cdf, scatter, scat_sum, p0=[0,1]) # curve fit
print(u"norm.fit(): µ1= {:+.4f}, σ1={:.4f}".format(mu1, sigma1))
print(u"curve_fit(): µ2= {:+.4f}, σ2={:.4f}".format(mu2, sigma2))
fg = plt.figure(1); fg.clf()
ax = fg.add_subplot(1, 1, 1)
t = np.linspace(-4,4, 1000)
ax.plot(t, norm.cdf(t, mu1, sigma1), alpha=.5, label="norm.fit()")
ax.plot(t, norm.cdf(t, mu2, sigma2), alpha=.5, label="curve_fit()")
ax.step(scatter, scat_sum, 'x-', where='post', alpha=.5, label="Samples")
ax.legend(loc="best")
ax.grid(True)
ax.set_xlabel("$x$")
ax.set_ylabel("Cumulative Probability Density")
ax.set_title("Fit to Normal Distribution")
fg.canvas.draw()
plt.show()
prints 版画
norm.fit(): µ1= +0.1534, σ1=1.0203
curve_fit(): µ2= +0.1135, σ2=1.0444
and plots 和情节
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.