Using a Gaussian mixture for a 1D array in Python sklearn

Question

I would like to use a Gaussian mixture model to produce something like the image below, but with proper Gaussians.

I'm attempting to use Python's sklearn.mixture.GaussianMixture, but I have failed. I can treat each peak as though it were the height of a histogram for any given x value. My question is: do I have to find a way to transform this graph into a histogram and remove the negative values, or is there a way to apply GMM directly to this array to produce the red and green Gaussians?

Answer 1

There is a difference between fitting a Gaussian curve so that it passes through a set of points and modeling the probability distribution of some data using GMM.

When you use GMM you are doing the latter, and it won't work for this problem.

If you apply GMM using only the variable on the Y axis, you will get a Gaussian distribution of Y that does not take the X variable into account.
If you apply GMM using 2 variables, you will get two-dimensional Gaussians that won't be of any help for your problem.
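
To make the distinction concrete, here is a minimal sketch. It assumes synthetic data (the peak positions, widths, and the names samples, x, and y are made up for illustration and are not from the question). Fitted on samples drawn from two Gaussians, GaussianMixture recovers the peak locations. Fitted on the heights of a curve, it only describes how the heights themselves are distributed.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Case 1: samples drawn from two Gaussians (hypothetical test data).
samples = np.concatenate([
    rng.normal(loc=-2.0, scale=0.5, size=2000),
    rng.normal(loc=3.0, scale=1.0, size=2000),
])
# GaussianMixture expects an array of shape (n_samples, n_features).
gmm = GaussianMixture(n_components=2, random_state=0)
gmm.fit(samples.reshape(-1, 1))
sample_means = np.sort(gmm.means_.ravel())
print("Means from samples:", sample_means)

# Case 2: the heights of a curve y(x) with peaks at the same positions.
x = np.linspace(-6.0, 8.0, 400)
y = (1.0 * np.exp(-(x + 2.0) ** 2 / (2 * 0.5 ** 2))
     + 0.6 * np.exp(-(x - 3.0) ** 2 / (2 * 1.0 ** 2)))
gmm_y = GaussianMixture(n_components=2, random_state=0)
gmm_y.fit(y.reshape(-1, 1))
height_means = gmm_y.means_.ravel()
# These means are typical heights of the curve, not peak positions on the x axis.
print("Means from curve heights:", height_means)
```

In the second case the model never sees x, so its means are values on the Y axis and say nothing about where the peaks sit.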

Now, if what you want is to fit a Gaussian curve, try the answer to this question:

import numpy
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

# Define some test data which is close to Gaussian
data = numpy.random.normal(size=10000)

hist, bin_edges = numpy.histogram(data, density=True)
bin_centres = (bin_edges[:-1] + bin_edges[1:])/2

# Define model function to be used to fit to the data above:
# Adapt it to as many gaussians you may want
# by copying the function with different A2,mu2,sigma2 parameters
def gauss(x, *p):
    A, mu, sigma = p
    return A*numpy.exp(-(x-mu)**2/(2.*sigma**2))

# p0 is the initial guess for the fitting coefficients (A, mu and sigma above)
p0 = [1., 0., 1.]

coeff, var_matrix = curve_fit(gauss, bin_centres, hist, p0=p0)

# Get the fitted curve
hist_fit = gauss(bin_centres, *coeff)

plt.plot(bin_centres, hist, label='Test data')
plt.plot(bin_centres, hist_fit, label='Fitted data')

# Finally, let's get the fitting parameters, i.e. the mean and standard deviation:
print('Fitted mean = ', coeff[1])
print('Fitted standard deviation = ', coeff[2])

plt.show()

Update on how to adapt the code for multiple Gaussians:

def gauss2(x, *p):
    A1, mu1, sigma1, A2, mu2, sigma2 = p
    return A1*numpy.exp(-(x-mu1)**2/(2.*sigma1**2)) + A2*numpy.exp(-(x-mu2)**2/(2.*sigma2**2))

# p0 is the initial guess for the six fitting coefficients.
# Give the two Gaussians different starting values so the optimizer can separate them.
p0 = [1., -1., 1., 1., 1., 1.]

# Fit the two-Gaussian model; X_data and y_data are your own x and y arrays.
# The result holds 6 coefficients (3 for each Gaussian).
coeff, var_matrix = curve_fit(gauss2, X_data, y_data, p0=p0)

#you can plot each gaussian separately using 
pg1 = coeff[0:3]
pg2 = coeff[3:]

g1 = gauss(X_data, *pg1)
g2 = gauss(X_data, *pg2)

plt.plot(X_data, y_data, label='Data')
plt.plot(X_data, g1, label='Gaussian1')
plt.plot(X_data, g2, label='Gaussian2')
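
The snippet above leaves X_data and y_data undefined. As a self-contained sketch, the version below generates a synthetic two-peak curve with a little noise and fits it. The true parameters, the noise level, and the initial guess are assumptions chosen for illustration. It also takes the square root of the diagonal of the covariance matrix returned by curve_fit as a rough one-standard-deviation uncertainty on each coefficient.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss(x, A, mu, sigma):
    """Single Gaussian with amplitude A, centre mu and width sigma."""
    return A * np.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2))

def gauss2(x, A1, mu1, sigma1, A2, mu2, sigma2):
    """Sum of two Gaussians."""
    return gauss(x, A1, mu1, sigma1) + gauss(x, A2, mu2, sigma2)

# Synthetic stand-in for the curve in the question (hypothetical values).
rng = np.random.default_rng(0)
X_data = np.linspace(-6.0, 8.0, 300)
true_params = [1.0, -2.0, 0.5, 0.6, 3.0, 1.0]
y_data = gauss2(X_data, *true_params) + rng.normal(scale=0.02, size=X_data.size)

# Start the two components at different centres so they can separate.
p0 = [1.0, -1.0, 1.0, 1.0, 2.0, 1.0]
coeff, var_matrix = curve_fit(gauss2, X_data, y_data, p0=p0)

# Rough 1-sigma uncertainties from the covariance matrix.
perr = np.sqrt(np.diag(var_matrix))

# Evaluate each fitted component separately, e.g. for plotting.
g1 = gauss(X_data, *coeff[0:3])
g2 = gauss(X_data, *coeff[3:])

for name, value, err in zip(["A1", "mu1", "sigma1", "A2", "mu2", "sigma2"], coeff, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

The sign of a fitted sigma is not determined, because only sigma squared enters the model, so take its absolute value when reporting a width. On real data, reading rough peak positions off the plot and using them in p0 tends to help the optimizer.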

Answer 1
4 2018-08-03 07:16:18
