
Fitting a bimodal distribution to a set of values

Given a 1D array of values, what is the simplest way to find the best-fit bimodal distribution for it, where each 'mode' is a normal distribution? In other words, how can you find the combination of two normal distributions that best reproduces the 1D array of values?

Specifically, I'm interested in implementing this in Python, but answers don't have to be language-specific.

Thanks!

What you are trying to fit is called a Gaussian mixture model. The standard approach to solving this is Expectation Maximization (EM). The scipy svn includes a section on machine learning and EM called scikits. I use it a fair bit.
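To make the EM idea concrete, here is a minimal sketch of a two-component, 1D Gaussian mixture fit written with only the Python standard library. It is not the scikits implementation mentioned above; the function names (`normal_pdf`, `fit_two_gaussians`), the split-in-half initialisation, the iteration limit, and the synthetic sample are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, n_iter=200, tol=1e-8):
    """Fit a two-component 1D Gaussian mixture by EM.

    Returns (weights, means, sigmas), each a list of length 2,
    with the lower-mean component first for well-separated data.
    """
    data = sorted(values)
    n = len(data)
    # Crude initialisation (an assumption): treat the lower and upper
    # halves of the sorted data as the two starting components.
    halves = [data[: n // 2], data[n // 2:]]
    means = [sum(h) / len(h) for h in halves]
    sigmas = [
        max(math.sqrt(sum((x - m) ** 2 for x in h) / len(h)), 1e-6)
        for h, m in zip(halves, means)
    ]
    weights = [0.5, 0.5]
    prev_ll = -math.inf

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        ll = 0.0
        for x in data:
            p = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, sigmas)]
            total = max(p[0] + p[1], 1e-300)  # guard against underflow
            resp.append((p[0] / total, p[1] / total))
            ll += math.log(total)

        # M-step: re-estimate weight, mean and sigma of each component.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2
                      for r, x in zip(resp, data)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-6)

        # Stop once the log-likelihood no longer improves noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll

    return weights, means, sigmas


# Synthetic bimodal sample (an assumption, for demonstration only).
rng = random.Random(0)
sample = ([rng.gauss(0.0, 1.0) for _ in range(600)]
          + [rng.gauss(5.0, 0.7) for _ in range(400)])
weights, means, sigmas = fit_two_gaussians(sample)
print(weights, means, sigmas)
```

EM works on the raw values rather than on a histogram, so no binning choice is involved. It only finds a local optimum, so the result can depend on the initialisation.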

I suggest using the awesome scipy package. It provides a few methods for optimisation.

There's a big fat caveat with simply applying a pre-defined least-squares fit or something along those lines.

Here are a few problems you will run into:

  1. Noise - the noise is larger than the second peak, or than both peaks.
  2. Partial peak - your data is cut off at one of the borders.
  3. Sampling - the peaks are narrower than the spacing of your sampled data.
  4. It isn't normal - you'll get some result ...
  5. Overlap - if the peaks overlap, you'll often find that one peak is fitted correctly but the second approaches zero...
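With those caveats in mind, a least-squares fit with scipy might be sketched as below. This is only an illustration under stated assumptions: it fits a sum of two Gaussian peaks to a histogram of the values with `scipy.optimize.curve_fit`, and the model function `two_gaussians`, the bin count, the quantile-based starting guesses, the bounds, and the synthetic data are all choices made for this example. The starting guesses and bounds are there to reduce the risk of one peak collapsing towards zero; they do not remove it.

```python
import numpy as np
from scipy.optimize import curve_fit


def two_gaussians(x, a1, mu1, s1, a2, mu2, s2):
    """Sum of two unnormalised Gaussian peaks (hypothetical model)."""
    return (a1 * np.exp(-0.5 * ((x - mu1) / s1) ** 2)
            + a2 * np.exp(-0.5 * ((x - mu2) / s2) ** 2))


# Synthetic bimodal sample (an assumption, for demonstration only).
rng = np.random.default_rng(0)
values = np.concatenate([rng.normal(0.0, 1.0, 600),
                         rng.normal(5.0, 0.7, 400)])

# Least squares needs (x, y) pairs, so bin the values first.
counts, edges = np.histogram(values, bins=40)
centers = 0.5 * (edges[:-1] + edges[1:])

# Starting guesses: place the two peaks at the lower and upper quartiles.
q25, q75 = np.percentile(values, [25, 75])
spread = values.std()
p0 = [counts.max(), q25, spread / 2, counts.max(), q75, spread / 2]

# Bounds keep amplitudes non-negative and peaks inside the data range.
lo, hi = values.min(), values.max()
lower = [0, lo, 1e-3, 0, lo, 1e-3]
upper = [np.inf, hi, hi - lo, np.inf, hi, hi - lo]

params, cov = curve_fit(two_gaussians, centers, counts,
                        p0=p0, bounds=(lower, upper))
a1, mu1, s1, a2, mu2, s2 = params
print("peak 1:", a1, mu1, s1)
print("peak 2:", a2, mu2, s2)
```

The fitted amplitudes here are histogram counts, not mixture weights, and the result depends on the binning. It is worth plotting the fitted curve over the histogram to check for the failure modes listed above.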

I am just trying to understand why one would need to fit a bimodal distribution to a 1D array. What are the advantages of doing this?


 