
Weird behavior of numpy.histogram / random numbers in numpy?

I stumbled upon some peculiar behavior of random numbers in Python; specifically, I am using the numpy.random module.

Consider the following expression:

import numpy as np

n = 50
N = 1000
np.histogram(np.sum(np.random.randint(0, 2, size=(n, N)), axis=0), bins=n+1)[0]

In the limit of large N I would expect a binomial distribution (for the interested reader, this simulates the Ehrenfest model), and for large n a normal distribution. A typical output, however, looks like this:

array([  1,   0,   0,   1,   0,   2,   0,   1,   0,  15,   0,
        12,   0,  18,   0,  39,   0,  64,   0,  62,   0, 109,
         0,  97,   0, 107,   0, 114,   0, 102,   0,  92,   0,
        55,   0,  46,   0,  35,   0,  10,   0,   9,   0,   4,
         0,   0,   0,   3,   0,   1,   1])

Given the statement above, I can't explain the zeros in the histogram. Am I missing something obvious here?

You're using histogram wrong. The bins aren't where you think they are. They don't go from 0 to 50; they go from the minimum input value to the maximum input value. That range is at most 50 wide, so splitting it into n+1 = 51 equal bins makes each bin narrower than 1. The 0s represent bins that lie entirely between two integers.
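To see where the bins actually land, it may help to look at the edges that np.histogram returns alongside the counts. The following is only a sketch: the seed, the use of np.random.default_rng instead of np.random.randint, and the variable names are my own choices, not part of the question.

```python
import numpy as np

# Assumption: a seeded Generator is used only to make the sketch repeatable.
rng = np.random.default_rng(0)
n, N = 50, 1000
sums = rng.integers(0, 2, size=(n, N)).sum(axis=0)

# np.histogram returns the counts and the bin edges it used.
counts, edges = np.histogram(sums, bins=n + 1)
width = edges[1] - edges[0]

print("first edge:", edges[0], "(min of data:", sums.min(), ")")
print("last edge: ", edges[-1], "(max of data:", sums.max(), ")")
print("bin width: ", width)

# Count how many integers of the data range fall into each bin.
# A bin that holds no integer can never receive a sample.
ints = np.arange(sums.min(), sums.max() + 1)
ints_per_bin, _ = np.histogram(ints, bins=edges)
print("bins containing no integer:", np.flatnonzero(ints_per_bin == 0))
```

The bins listed on the last line are exactly the ones that are forced to be 0 in the histogram output.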

Try it with numpy.bincount:

In [31]: n = 50

In [32]: N = 5000

In [33]: np.bincount(np.sum(np.random.randint(0, 2, size=(n, N)), axis=0))
Out[33]: 
array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   7,  13,  22,  46,  75, 126, 220, 305, 367, 461, 550, 578,
       517, 471, 438, 314, 189, 146,  76,  50,  17,   9,   2,   1])
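If you would rather stay with np.histogram, one option is to pass explicit bin edges so that each bin is centered on an integer from 0 to n. This is a sketch under my own assumptions (the seed, the half-integer edges, and the minlength argument are not from the question); it checks that both approaches give the same counts.

```python
import numpy as np

# Assumption: a seeded Generator is used only to make the sketch repeatable.
rng = np.random.default_rng(0)
n, N = 50, 5000
sums = rng.integers(0, 2, size=(n, N)).sum(axis=0)

# bincount counts occurrences of each integer; minlength pads the
# result so it always covers 0..n, even if large values never occur.
by_bincount = np.bincount(sums, minlength=n + 1)

# Edges at -0.5, 0.5, ..., n + 0.5 give one unit-width bin per integer.
edges = np.arange(n + 2) - 0.5
by_histogram, _ = np.histogram(sums, bins=edges)

print(np.array_equal(by_bincount, by_histogram))
```

With the edges placed at half-integers, no bin falls between two integers, so the only zeros left are for sums that genuinely never occurred.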


 