
Moment matching - simulating a discrete distribution with specified moments (mean, standard deviation, skewness, kurtosis) in Python

Is there any library/function in Python which allows us to generate discrete data that matches given target moments (mean, standard deviation, skewness, kurtosis)? I do not wish to necessarily enforce any specific underlying continuous distribution.

That is, I want to generate, say, 10000 numbers, such that when we calculate their first four moments using standard formulae we get something close to the target moments given as input.

Is there a known Python library that implements such a method? Here is an example of a paper in which this specific problem is solved (as part of a larger problem):

https://link.springer.com/article/10.1023/A:1021853807313

Thanks!

Yes, this is possible, although the sample moments will only match the targets approximately, not exactly.

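import statsmodels.sandbox.distributions.extras as extras
import scipy.interpolate as interpolate
import scipy.stats as ss
import numpy as np

def generate_normal_four_moments(mu, sigma, skew, kurt, size=10000, sd_wide=10):
    # pdf_mvsk expects [mean, variance, skewness, excess kurtosis],
    # so pass sigma**2 rather than sigma (identical only when sigma == 1).
    f = extras.pdf_mvsk([mu, sigma**2, skew, kurt])
    # Evaluate the approximate density on a grid of +/- sd_wide standard deviations.
    x = np.linspace(mu - sd_wide * sigma, mu + sd_wide * sigma, num=500)
    y = [f(i) for i in x]
    # The normalised cumulative sum approximates the CDF on the grid.
    yy = np.cumsum(y) / np.sum(y)
    # Invert the CDF by interpolation and apply it to uniform random draws.
    inv_cdf = interpolate.interp1d(yy, x, fill_value="extrapolate")
    rr = np.random.rand(size)

    return inv_cdf(rr)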
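As an aside, here is a minimal NumPy-only sketch of the same inverse-CDF idea, in case it helps to see what happens under the hood. It assumes that the density is a Gaussian corrected by third and fourth Hermite polynomial terms (a Gram-Charlier-type expansion), which is my understanding of what pdf_mvsk builds, and that kurt means excess kurtosis. The names gram_charlier_pdf and sample_four_moments, and the seed and num parameters, are made up for illustration. Such an expansion is not guaranteed to be non-negative, so the sketch clips negative values to zero, which is one reason the match can only be approximate.

```python
import numpy as np

def gram_charlier_pdf(x, mu, sigma, skew, excess_kurt):
    """Gaussian density with 3rd/4th Hermite corrections (may dip below zero)."""
    z = (x - mu) / sigma
    he3 = z**3 - 3.0 * z               # probabilists' Hermite polynomial He3
    he4 = z**4 - 6.0 * z**2 + 3.0      # probabilists' Hermite polynomial He4
    phi = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    return phi * (1.0 + skew / 6.0 * he3 + excess_kurt / 24.0 * he4) / sigma

def sample_four_moments(mu, sigma, skew, excess_kurt,
                        size=10000, sd_wide=10, num=2001, seed=None):
    """Draw samples by inverting a numerically tabulated CDF (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(mu - sd_wide * sigma, mu + sd_wide * sigma, num)
    # Clip negative density values so the tabulated CDF is non-decreasing.
    pdf = np.clip(gram_charlier_pdf(x, mu, sigma, skew, excess_kurt), 0.0, None)
    cdf = np.cumsum(pdf)
    cdf /= cdf[-1]
    # Keep only strictly increasing CDF values so the inverse is well defined.
    cdf_u, idx = np.unique(cdf, return_index=True)
    return np.interp(rng.random(size), cdf_u, x[idx])
```

A call such as sample_four_moments(0, 1, -1, 3, seed=0) plays the same role as the statsmodels-based function; large skewness or kurtosis targets push the expansion further from a valid density, and the sample moments will drift accordingly.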

Next, generate the data:

data = generate_normal_four_moments(mu=0, sigma=1, skew=-1, kurt=3)

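Let's check the moments (note that np.var returns the variance, not the standard deviation, and ss.kurtosis returns the excess (Fisher) kurtosis by default):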

np.mean(data)
np.var(data)
ss.skew(data)
ss.kurtosis(data)

This gives:

-0.039986656405454374
 1.051375501684874
-1.071149838792561
 2.9813805363255472
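If the small deviations in the mean and standard deviation matter, one option is to shift and rescale the sample afterwards. Skewness and kurtosis are unchanged by a shift plus a positive rescaling, so this makes the first two moments exact without disturbing the other two. The following is only a sketch; sample_moments and match_mean_std are hypothetical helper names, and the population (ddof=0) standard deviation is assumed, matching the defaults of np.std, ss.skew and ss.kurtosis.

```python
import numpy as np

def sample_moments(data):
    """Return (mean, std, skewness, excess kurtosis) using ddof=0 formulas."""
    data = np.asarray(data, dtype=float)
    mean = data.mean()
    std = data.std()
    z = (data - mean) / std
    return mean, std, np.mean(z**3), np.mean(z**4) - 3.0

def match_mean_std(data, mu, sigma):
    """Shift and rescale so the sample mean and std equal mu and sigma."""
    data = np.asarray(data, dtype=float)
    z = (data - data.mean()) / data.std()
    return mu + sigma * z
```

For example, adjusted = match_mean_std(data, mu=0, sigma=1) followed by sample_moments(adjusted) should report the target mean and standard deviation up to floating-point error, with the same skewness and kurtosis as before the adjustment.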
