
Check if numbers form a bell curve (Gaussian distribution) in Python 3

I've got files with irradiance data measured every minute, 24 hours a day. On a day without any clouds in the sky, the data shows a nice continuous bell curve. Until now, when looking for cloudless days in the data, I plotted month after month with gnuplot and checked for nice bell curves by eye.

I was wondering if there's a Python way to check whether the irradiance measurements form a continuous bell curve. I don't know if the question is too vague, but I'm simply looking for some ideas on that quest :-)

For a normal distribution, there are normality tests.

In short, we exploit what we know about the shape of normal distributions to identify them.

  • The kurtosis of any normal distribution is 3. Compute the kurtosis of your data and it should be close to 3.

  • The skewness of a normal distribution is zero, so your data should have a skewness close to zero.

  • More generally, you could compute a reference distribution and use a divergence measure, such as a Bregman divergence, to assess the difference between the two distributions. Bin your data, create a histogram, and start with the Jensen-Shannon divergence.
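The skewness and kurtosis checks from the list above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not a tuned detector: the `moment_summary` helper and the two sample arrays are made up for the example. Note that `scipy.stats.kurtosis` reports excess kurtosis by default (0 for a normal distribution), so `fisher=False` is passed to get the "close to 3" convention used above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)               # fixed seed so the sketch is repeatable
normal_data = rng.normal(size=10_000)        # synthetic Gaussian sample
skewed_data = rng.exponential(size=10_000)   # clearly non-Gaussian sample

def moment_summary(values):
    """Return (skewness, kurtosis) using the 'normal = 3' kurtosis convention."""
    skewness = stats.skew(values)
    # fisher=False gives Pearson kurtosis (3 for a normal distribution);
    # SciPy's default, fisher=True, reports excess kurtosis (0 for a normal).
    kurt = stats.kurtosis(values, fisher=False)
    return skewness, kurt

print(moment_summary(normal_data))   # expected: skewness near 0, kurtosis near 3
print(moment_summary(skewed_data))   # expected: both clearly larger
```

How close to 0 and 3 counts as "close enough" depends on the sample size and on how noisy the data is, so any threshold would have to be chosen empirically.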

With the divergence approach, you can compare against an arbitrary distribution. You might record a thousand sunny days and check whether the divergence between a typical sunny day and your measured day is below some threshold.
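A rough sketch of the histogram-plus-divergence idea, assuming SciPy's `scipy.spatial.distance.jensenshannon` is acceptable as the measure. That function returns the Jensen-Shannon distance, which is the square root of the divergence. The `js_distance` helper, the bin count, and the synthetic samples are assumptions made for illustration only.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(0)
reference = rng.normal(size=5_000)           # stand-in for the reference data
similar = rng.normal(size=5_000)             # drawn from the same distribution
different = rng.uniform(-3, 3, size=5_000)   # drawn from a different one

def js_distance(sample_a, sample_b, bins=30):
    """Jensen-Shannon distance between the histograms of two samples."""
    # Shared bin edges so both histograms describe the same intervals.
    edges = np.histogram_bin_edges(np.concatenate([sample_a, sample_b]), bins=bins)
    hist_a, _ = np.histogram(sample_a, bins=edges)
    hist_b, _ = np.histogram(sample_b, bins=edges)
    # jensenshannon normalises the counts to probabilities itself.
    return jensenshannon(hist_a, hist_b, base=2)

d_similar = js_distance(reference, similar)
d_different = js_distance(reference, different)
print(d_similar, d_different)   # the first value should be the smaller one
```

With `base=2` the distance lies between 0 and 1. A usable threshold would have to come from real data, for example from the spread of distances among days already known to be cloudless.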

Just to complement the given answer with a code example: one can use a Kolmogorov-Smirnov test to obtain a measure for the "distance" between two distributions. SciPy offers a neat interface for this, called kstest:

from scipy import stats
import numpy as np

data = np.random.normal(size=100)  # Our (synthetic) dataset
D, p = stats.kstest(data, "norm")  # One-sample Kolmogorov-Smirnov test against the standard normal

In the above example, D denotes the distance between our data and a standard normal (norm) distribution (smaller means a closer match), and p denotes the corresponding p-value. Other distributions can be tested in the same way by substituting norm with any of those implemented in scipy.stats.
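One practical caveat, sketched below under made-up numbers: with no extra arguments, "norm" means a normal distribution with mean 0 and standard deviation 1, so data on another scale looks very far from it. The `args` parameter of `kstest` passes loc and scale to the reference distribution. The values 500 and 120 are arbitrary and not taken from any real measurements.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(loc=500.0, scale=120.0, size=500)  # synthetic, not centred on 0

# Comparing against the standard normal fails because of the offset and scale.
D_raw, p_raw = stats.kstest(data, "norm")

# Passing loc and scale through `args` compares against N(mean, std) instead.
# Caveat: estimating mean and std from the same data makes the p-value optimistic.
mean, std = data.mean(), data.std(ddof=1)
D_fit, p_fit = stats.kstest(data, "norm", args=(mean, std))

print(D_raw, p_raw)   # large distance, tiny p-value
print(D_fit, p_fit)   # small distance
```

Because the parameters are fitted to the data being tested, the second p-value is best read as a rough indication rather than an exact significance level.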

