
Generating numbers (distribution) for a given Kurtosis or skewness

I am new to using statistical functions in Excel. Given a set of numbers, I can use the KURT function to calculate the kurtosis (and the SKEW function to calculate the skewness).

But my requirement is the other way around: for a given skewness or kurtosis, is there a way to generate random numbers? Any pointers on how to do that would be appreciated.

The function should take the skewness or kurtosis value as input and generate 50 random numbers, with 1 being the minimum and 100,000 being the maximum.

If Excel does not have a way to do this, I am open to suggestions in Python.

Can you please help me do this in Excel or Python?

After experimenting with several distributions, the generalised Gamma distribution seems flexible enough to adjust either the skew or the kurtosis to the desired value, but not both at the same time, as was asked in the question @gabriel mentioned in his comment.

So to draw a sample from a g-Gamma distribution with a single fixed moment, you can use scipy.optimize to find the distribution parameters that minimize a penalty function (I chose (target - value) ** 2):

from scipy import stats, optimize
import numpy as np

def random_by_moment(moment, value, size):
    """ Draw `size` samples out of a generalised Gamma distribution
    where a given moment has a given value """
    # tuple membership: a plain `in 'mvsk'` would also accept '' or 'mv'
    assert moment in ('m', 'v', 's', 'k'), "'{}' invalid moment. Use 'm' for mean, "\
            "'v' for variance, 's' for skew and 'k' for kurtosis".format(moment)
    def gengamma_error(a):
        # scipy reports 'k' as excess (Fisher) kurtosis, i.e. 0 for a normal distribution
        m, v, s, k = (stats.gengamma.stats(a[0], a[1], moments="mvsk"))
        moments = {'m': m, 'v': v, 's': s, 'k': k}
        return (moments[moment] - value) ** 2    # has its minimum at the desired value

    a, c = optimize.minimize(gengamma_error, (1, 1)).x
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_moment('k', 3, 100000)
# test if result is correct
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
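The question also asks for 50 numbers between 1 and 100,000, which the function above does not enforce. Skewness and kurtosis do not change under a positive linear rescaling, so one option is to stretch the drawn sample onto the target range afterwards. The following is a minimal sketch of that idea, not part of the original answer: `rescale_to_range` is a hypothetical helper, and the g-Gamma parameters (2.0, 1.5) are arbitrary stand-ins for a fitted distribution.

```python
import numpy as np
from scipy import stats

def rescale_to_range(sample, low=1.0, high=100_000.0):
    """Linearly map `sample` so its minimum is `low` and its maximum is `high`.
    (Hypothetical helper, named for illustration only.)"""
    sample = np.asarray(sample, dtype=float)
    span = sample.max() - sample.min()
    if span == 0:
        raise ValueError("sample must contain at least two distinct values")
    return low + (sample - sample.min()) * (high - low) / span

# Stand-in sample; the shape parameters 2.0 and 1.5 are assumptions for the demo
raw = stats.gengamma.rvs(2.0, 1.5, size=50, random_state=0)
scaled = rescale_to_range(raw)

# Skew and kurtosis of the rescaled sample equal those of the raw sample
print("min={}, max={}".format(scaled.min(), scaled.max()))
print("skew: raw={}, scaled={}".format(stats.skew(raw), stats.skew(scaled)))
print("kurt: raw={}, scaled={}".format(stats.kurtosis(raw), stats.kurtosis(scaled)))
```

Note that with only 50 draws the sample skew and kurtosis can differ noticeably from the values of the underlying distribution, so the target is only matched approximately.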

Before that, I came up with a function that matches skew and kurtosis together. However, even the g-Gamma is not flexible enough to serve this purpose, depending on how extreme your conditions are:

def random_by_sk(skew, kurt, size):
    def gengamma_error(a):
        s, k = (stats.gengamma.stats(a[0], a[1], moments="sk"))
        return (s - skew) ** 2 + (k - kurt) ** 2  # penalty equally weighted for skew and kurtosis

    a, c = optimize.minimize(gengamma_error, (1, 1)).x    
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_sk(3, 3, 100000)
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
# will yield skew ~2 and kurtosis ~3 instead of 3, 3
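Part of the mismatch for the (3, 3) target may not be specific to the g-Gamma at all. For any distribution with a finite fourth moment, kurtosis is at least skew² + 1, which in terms of the excess kurtosis that scipy reports means excess kurtosis ≥ skew² − 2. A quick pre-check along these lines can flag targets that no distribution can reach. This is a sketch, assuming the kurtosis target is given as excess kurtosis; the function names are hypothetical.

```python
def min_excess_kurtosis(skew):
    """Smallest excess kurtosis any distribution with this skew can have
    (from the general bound kurtosis >= skew**2 + 1)."""
    return skew ** 2 - 2

def is_feasible_skew_kurt(skew, excess_kurt):
    """True if some distribution could have this skew / excess kurtosis pair.
    Passing this check does not guarantee that a g-Gamma can reach the pair."""
    return excess_kurt >= min_excess_kurtosis(skew)

# The target from the example above: skew 3 would need excess kurtosis >= 7
print(min_excess_kurtosis(3))
print(is_feasible_skew_kurt(3, 3))
```

The check is only a necessary condition: a feasible pair may still lie outside what the generalised Gamma family can represent.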


 