
How to generate a random normal distribution of integers

How can I generate a random integer, as with np.random.randint(), but with a normal distribution around 0?

np.random.randint(-10, 10) returns integers with a discrete uniform distribution.
np.random.normal(0, 0.1, 1) returns floats with a normal distribution.

What I want is a combination of the two functions.

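Another way to get a discrete distribution that looks like the normal distribution is to draw from a multinomial distribution whose probabilities are calculated from a normal distribution.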

import scipy.stats as ss
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(-10, 11)
xU, xL = x + 0.5, x - 0.5 
prob = ss.norm.cdf(xU, scale = 3) - ss.norm.cdf(xL, scale = 3)
prob = prob / prob.sum() # normalize the probabilities so their sum is 1
nums = np.random.choice(x, size = 10000, p = prob)
plt.hist(nums, bins = len(x))
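
The snippet above can be wrapped into a small helper and checked against its own target probabilities. This is only a minimal sketch, not part of the original answer: the names normal_cdf and discrete_normal_probs and the seed are assumptions, and the normal CDF is computed with math.erf as a stand-in for ss.norm.cdf so that the example needs only NumPy.

```python
import math
import numpy as np

def normal_cdf(z, scale):
    # CDF of a zero-mean normal with standard deviation `scale`
    return 0.5 * (1.0 + math.erf(z / (scale * math.sqrt(2.0))))

def discrete_normal_probs(low, high, scale):
    # Probability mass of each integer k in [low, high]: the normal
    # probability of the interval [k - 0.5, k + 0.5], renormalized to sum to 1
    x = np.arange(low, high + 1)
    prob = np.array([normal_cdf(k + 0.5, scale) - normal_cdf(k - 0.5, scale)
                     for k in x])
    return x, prob / prob.sum()

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
x, prob = discrete_normal_probs(-10, 10, scale=3)
nums = rng.choice(x, size=10000, p=prob)

# Empirical frequency of each integer, to compare against prob
freq = np.array([(nums == k).mean() for k in x])
print(np.abs(freq - prob).max())
```

The printed value is the largest gap between the observed frequencies and the target probabilities; it should shrink as size grows.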

A similar distribution can be generated from a truncated normal distribution that is rounded to integers. Here is an example using scipy's truncnorm().

import numpy as np
from scipy.stats import truncnorm
import matplotlib.pyplot as plt

scale = 3.
bound = 10   # renamed from `range`, which shadows the Python built-in
size = 100000

X = truncnorm(a=-bound/scale, b=+bound/scale, scale=scale).rvs(size=size)
X = X.round().astype(int)
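
One detail of the truncnorm snippet above: because the normal is cut at exactly -10 and +10 before rounding, the end values -10 and 10 each collect only a half-width interval (for example 9.5 to 10). The sketch below is a NumPy-only variant that swaps in plain rejection sampling for truncnorm and cuts at ±10.5 instead, so every integer covers a full unit interval. The function name rounded_truncated_normal and the seed are assumptions for illustration.

```python
import numpy as np

def rounded_truncated_normal(bound, scale, size, rng):
    # Draw normal values, keep only those inside (-bound - 0.5, bound + 0.5),
    # and repeat until `size` values have been collected (rejection sampling)
    limit = bound + 0.5
    kept = np.empty(0)
    while kept.size < size:
        draw = rng.normal(scale=scale, size=size)
        kept = np.concatenate([kept, draw[np.abs(draw) < limit]])
    # Round the first `size` accepted values to the nearest integer
    return np.round(kept[:size]).astype(int)

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
X = rounded_truncated_normal(bound=10, scale=3.0, size=100000, rng=rng)
values, counts = np.unique(X, return_counts=True)
print(values.min(), values.max())
```

Rejection sampling is wasteful when the bounds are narrow relative to scale; in that case truncnorm with a and b widened by half a unit may be the better choice.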

The accepted answer here works, but I tried Will Vousden's solution and it works well too:

import numpy as np
import matplotlib.pyplot as plt

# Generate Distribution:
randomNums = np.random.normal(scale=3, size=100000)
randomInts = np.round(randomNums).astype(int)

# Plot, with one bin centered on each integer:
edges = np.arange(randomInts.min() - 0.5, randomInts.max() + 1.5)
plt.hist(randomInts, bins=edges)
plt.show()

Looks good, doesn't it?
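
Note that rounding alone does not bound the values: a few samples can fall outside the -10..10 range mentioned in the question. The following sketch (the seed and variable names are assumptions) measures how often that happens for scale=3 and shows one possible way to handle it, by dropping those samples.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
random_ints = np.round(rng.normal(scale=3, size=100000)).astype(int)

# Fraction of samples that land outside the -10..10 range from the question
outside = np.abs(random_ints) > 10
print(outside.mean())

# One option: drop those samples (the result is then shorter than requested)
bounded = random_ints[~outside]
print(bounded.min(), bounded.max())
```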

Here we start by getting values from the bell curve.

Code:

#--------*---------*---------*---------*---------*---------*---------*---------*
# Desc: Discretize a normal distribution centered at 0
#--------*---------*---------*---------*---------*---------*---------*---------*

import sys
import random
from math import sqrt, pi
import numpy as np
import matplotlib.pyplot as plt

def gaussian(x, var):
    k1 = np.power(x, 2)
    k2 = -k1/(2*var)
    return (1./(sqrt(2. * pi * var))) * np.exp(k2)

#--------*---------*---------*---------*---------*---------*---------*---------#
while 1:#                          M A I N L I N E                             #
#--------*---------*---------*---------*---------*---------*---------*---------#
#                                  # probability density function
#                                  #   for discrete normal RV
    pdf_DGV = []
    pdf_DGW = []    
    var = 9
    tot = 0    
#                                  # create 'rough' gaussian
    for i in range(-var - 1, var + 2):
        if i ==  -var - 1:
            r_pdf = + gaussian(i, 9) + gaussian(i - 1, 9) + gaussian(i - 2, 9)
        elif i == var + 1:
            r_pdf = + gaussian(i, 9) + gaussian(i + 1, 9) + gaussian(i + 2, 9)
        else:
            r_pdf = gaussian(i, 9)
        tot = tot + r_pdf
        pdf_DGV.append(i)
        pdf_DGW.append(r_pdf)
        print(i, r_pdf)
#                                  # amusing how close tot is to 1!
    print('\nRough total = ', tot)
#                                  # no need to normalize with Python 3.6,
#                                  #   but can't help ourselves
    for i in range(0,len(pdf_DGW)):
        pdf_DGW[i] = pdf_DGW[i]/tot
#                                  # print out pdf weights
#                                  #   for our discrete gaussian
    print('\npdf:\n')
    print(pdf_DGW)

#                                  # plot random variable action
    rv_samples = random.choices(pdf_DGV, pdf_DGW, k=10000)
    plt.hist(rv_samples, bins = 100)
    plt.show()
    sys.exit()
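
The same discretization can be written more compactly with NumPy arrays. This is a sketch of the idea in the script above, not the author's code: the names support, weights, and rough_total and the seed are assumptions, and rng.choice stands in for random.choices.

```python
import numpy as np

def gaussian(x, var):
    # Density of a zero-mean normal with variance `var`
    return np.exp(-np.power(x, 2) / (2 * var)) / np.sqrt(2 * np.pi * var)

var = 9
support = np.arange(-var - 1, var + 2)        # integers -10..10
weights = gaussian(support, var)

# Fold the first two integers beyond each end into the end points
weights[0] += gaussian(support[0] - 1, var) + gaussian(support[0] - 2, var)
weights[-1] += gaussian(support[-1] + 1, var) + gaussian(support[-1] + 2, var)
rough_total = weights.sum()
print('Rough total =', rough_total)

weights = weights / rough_total               # normalize so the weights sum to 1
rng = np.random.default_rng(0)                # seeded only for repeatability
rv_samples = rng.choice(support, size=10000, p=weights)
```

Unlike random.choices, rng.choice requires the probabilities to sum to 1, so the normalization step is needed here.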

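This version is mathematically incorrect (because you clip the bell), but if you don't need that much precision it gets the job done quickly and is easy to understand: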

import numpy as np

def draw_random_normal_int(low: int, high: int) -> int:

    # generate a random normal number (a scalar float)
    normal = np.random.normal(loc=0, scale=1)

    # clip to -3, 3 (where the bell with mean 0 and std 1 is very close to zero)
    normal = -3 if normal < -3 else normal
    normal = 3 if normal > 3 else normal

    # scale range of 6 (-3..3) to range of low-high
    scaling_factor = (high-low) / 6
    normal_scaled = normal * scaling_factor

    # center around mean of range of low high
    normal_scaled += low + (high-low)/2

    # then round and return as a Python int
    return int(np.round(normal_scaled))
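
When many values are needed, the same clip-and-scale idea can be applied to a whole array at once. A minimal sketch, assuming a hypothetical function name draw_random_normal_ints and an explicit NumPy Generator passed in by the caller:

```python
import numpy as np

def draw_random_normal_ints(low, high, size, rng):
    # Standard normal draws, clipped to [-3, 3]
    normal = np.clip(rng.normal(loc=0, scale=1, size=size), -3, 3)
    # Map [-3, 3] linearly onto [low, high], then round to integers
    scaled = low + (normal + 3) * (high - low) / 6
    return np.round(scaled).astype(int)

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
draws = draw_random_normal_ints(-10, 10, size=10000, rng=rng)
print(draws.min(), draws.max(), draws.mean())
```

As in the scalar version, the clipped tail mass piles up on low and high, so the end points are slightly over-represented.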

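Old question, new answer:

For a bell-shaped distribution on the integers {-10, -9, ..., 9, 10}, you can use the binomial distribution with n=20 and p=0.5, and subtract 10 from the samples.

For example,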

In [167]: import numpy as np

In [168]: import matplotlib.pyplot as plt

In [169]: rng = np.random.default_rng()

In [170]: N = 5000000   # Number of samples to generate

In [171]: samples = rng.binomial(n=20, p=0.5, size=N) - 10

In [172]: samples.min(), samples.max()
Out[172]: (-10, 10)
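
One consequence of the binomial approach is that the spread is tied to the range: with n=20 and p=0.5 the variance is n*p*(1-p) = 5, and it cannot be chosen separately. The sketch below generalizes the session above to any symmetric range and checks the sample mean and variance; the function name binomial_bell, the half_width parameter, and the seed are assumptions.

```python
import numpy as np

def binomial_bell(half_width, size, rng):
    # Binomial(2 * half_width, 0.5) takes values 0..2*half_width;
    # subtracting half_width centers it on 0
    return rng.binomial(n=2 * half_width, p=0.5, size=size) - half_width

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
samples = binomial_bell(half_width=10, size=100000, rng=rng)

# For n=20, p=0.5 the theoretical mean is 0 and the variance is 5
print(samples.mean(), samples.var())
```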

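I'm not sure whether the scipy generators have an option for choosing the type of the generated values, but a common way to generate them is with scipy.stats: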

# Generate pseudodata  from a single normal distribution
import scipy
from scipy import stats
import numpy as np
import matplotlib.pyplot as plt

dist_mean = 0.0
dist_std = 0.5
n_events = 500

toy_data = scipy.stats.norm.rvs(dist_mean, dist_std, size=n_events)
toy_data2 = [[i, j] for i, j in enumerate(toy_data)]  # [index, value] pairs

arr = np.array(toy_data2)
print("index:\n", arr[:, 0])
print("sample:\n", arr[:, 1])
plt.scatter(arr[:, 1], arr[:, 0])
plt.xlabel("sample")
plt.ylabel("index")
plt.show()

Or this way (also with no option for choosing the dtype):

import numpy as np
import matplotlib.pyplot as plt

mu, sigma = 0, 0.1 # mean and standard deviation
s = np.random.normal(mu, sigma, 500)

count, bins, ignored = plt.hist(s, 30, density=True)
plt.show()
print(bins)     # <<<<<<<<<<

plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) * np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
          linewidth=2, color='r')
plt.show()

The most common way, without visualization (also with no way to specify the variable type):

bins = np.random.normal(3, 2.5, size=(10, 1))

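A wrapper class could be written to instantiate a container with a given dtype for the values (for example, by rounding the floats to integers, as mentioned above)...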
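
Such a wrapper might look like the following. This is only a sketch of the idea: the class name IntegerNormal, its parameters, and the rvs method are hypothetical and do not correspond to any existing NumPy or scipy API.

```python
import numpy as np

class IntegerNormal:
    """Hypothetical wrapper: normal draws rounded to a chosen integer dtype."""

    def __init__(self, loc=0.0, scale=1.0, dtype=np.int64, seed=None):
        self.loc = loc
        self.scale = scale
        self.dtype = dtype
        self.rng = np.random.default_rng(seed)

    def rvs(self, size):
        # Draw floats, round to the nearest integer, then cast to the dtype
        draws = self.rng.normal(self.loc, self.scale, size=size)
        return np.round(draws).astype(self.dtype)

bins = IntegerNormal(loc=3, scale=2.5, dtype=np.int32, seed=0).rvs(size=(10, 1))
print(bins.dtype, bins.shape)
```

The rounded values are not bounded, so a narrow dtype such as np.int8 could overflow for a large scale.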
