Differences between numpy.random.rand vs numpy.random.randn in Python
What are the differences between numpy.random.rand and numpy.random.randn?
From the documentation, I know the only difference between them is the probability distribution each number is drawn from; the overall structure (dimensions) and the data type used (float) are the same. I have a hard time debugging a neural network because of this.
Specifically, I am trying to re-implement the neural network provided in the book Neural Networks and Deep Learning by Michael Nielsen. The original code can be found here. My implementation was the same as the original; however, I defined and initialized the weights and biases with numpy.random.rand in the init function, rather than with numpy.random.randn as shown in the original.
However, my code that uses random.rand to initialize the weights and biases does not work. The network won't learn, and the weights and biases will not change.
What is the difference between the two random functions that causes this weirdness?
First, as you can see from the documentation, numpy.random.randn generates samples from the standard normal distribution, while numpy.random.rand generates samples from a uniform distribution (in the range [0,1)).
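A quick numerical check makes the contrast concrete. This is only a sketch: the sample size and the seed are arbitrary choices, not part of the question. In theory a uniform [0,1) sample has mean 0.5 and standard deviation sqrt(1/12), about 0.289, and never goes negative, while a standard normal sample has mean 0, standard deviation 1, and takes negative values about half the time.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so that the run is repeatable

n = 100_000                # arbitrary sample size
u = np.random.rand(n)      # uniform on [0, 1): never negative
g = np.random.randn(n)     # standard normal: centered on 0, unbounded

# Print summary statistics for both samples
for name, sample in (("rand", u), ("randn", g)):
    print("%-5s min=%.3f max=%.3f mean=%.3f std=%.3f"
          % (name, sample.min(), sample.max(), sample.mean(), sample.std()))

# Fraction of negative values in each sample
print("negative fraction, rand :", np.mean(u < 0))
print("negative fraction, randn:", np.mean(g < 0))
```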
Second, why did the uniform distribution not work? The main reason is the activation function, especially in your case where you use the sigmoid function. The sigmoid is steepest at 0 and flattens out toward 0 and 1 as its input moves away from zero in either direction.
If the input is far from 0, the slope of the function decreases quite fast, so you get a tiny gradient and a tiny weight update. If you have many layers, those gradients get multiplied many times in the backward pass, so even "proper" gradients become small after the multiplications and stop having any influence. So if many of your weights push the input into those regions, your network is hardly trainable. With numpy.random.rand every weight and bias is positive, so if the inputs are non-negative too (as with pixel intensities), the weighted sum over many inputs is large and positive, which is exactly that flat region. That's why it is usual practice to initialize network variables around zero. This is done to ensure that you get reasonable gradients (the sigmoid's slope peaks at 0.25 at zero input) to train your net.
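The saturation argument can be checked on a single layer. The sketch below is an illustration under assumptions that are not in the question: a layer with 784 inputs and 30 neurons (hypothetical sizes), and a random stand-in input with values in [0, 1). It compares the sigmoid slope at the pre-activations produced by the two initializers.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # Derivative of the sigmoid; its maximum is 0.25 at z = 0
    s = sigmoid(z)
    return s * (1.0 - s)

np.random.seed(0)                # arbitrary seed for repeatability
n_in, n_out = 784, 30            # hypothetical layer sizes
x = np.random.rand(n_in, 1)      # stand-in input, all values in [0, 1)

# rand: every weight and bias is positive
w_u, b_u = np.random.rand(n_out, n_in), np.random.rand(n_out, 1)
# randn: weights and biases are centered on zero
w_n, b_n = np.random.randn(n_out, n_in), np.random.randn(n_out, 1)

z_u = w_u @ x + b_u              # sum of hundreds of positive terms
z_n = w_n @ x + b_n              # positive and negative terms partly cancel

print("rand : smallest z =", z_u.min(), " largest slope =", sigmoid_prime(z_u).max())
print("randn: mean z     =", z_n.mean(), " largest slope =", sigmoid_prime(z_n).max())
```

With all-positive weights, every pre-activation ends up far to the right of zero, so the slope that backpropagation multiplies into the update is numerically negligible. The zero-centered weights still spread the pre-activations fairly wide, but some neurons stay in the region where the sigmoid responds.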
However, a uniform distribution is not completely undesirable; you just need to make the range smaller and center it on zero. One good practice is Xavier initialization. In this approach you can initialize your weights with:
1) A normal distribution with mean 0 and standard deviation sqrt(2. / (in + out)), where in is the number of inputs to the neurons and out is the number of outputs.
2) A uniform distribution in the range [-sqrt(6. / (in + out)), +sqrt(6. / (in + out))].
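Both variants can be built from the two functions in the question. The following is a minimal sketch, not code from the book: the helper names xavier_normal and xavier_uniform and the layer sizes are made up for illustration. Both variants give the weights a variance of 2 / (in + out), since a uniform distribution on [-a, a] has variance a^2 / 3.

```python
import numpy as np

def xavier_normal(n_in, n_out):
    # Hypothetical helper: scale standard-normal samples to the Xavier std
    std = np.sqrt(2.0 / (n_in + n_out))
    return std * np.random.randn(n_out, n_in)

def xavier_uniform(n_in, n_out):
    # Hypothetical helper: rand gives [0, 1); shift and scale it to
    # [-limit, limit) so the weights are centered on zero
    limit = np.sqrt(6.0 / (n_in + n_out))
    return (2.0 * np.random.rand(n_out, n_in) - 1.0) * limit

np.random.seed(0)                    # arbitrary seed for repeatability
w_norm = xavier_normal(784, 30)      # hypothetical layer sizes
w_unif = xavier_uniform(784, 30)

print("normal : mean=%.4f std=%.4f" % (w_norm.mean(), w_norm.std()))
print("uniform: mean=%.4f std=%.4f min=%.4f max=%.4f"
      % (w_unif.mean(), w_unif.std(), w_unif.min(), w_unif.max()))
```

The weight matrices use the shape (out, in) so that w @ x works for a column-vector input x; transpose them if your code stores weights the other way around.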
np.random.rand is for the uniform distribution (in the half-open interval [0.0, 1.0)).
np.random.randn is for the standard normal (a.k.a. Gaussian) distribution (mean 0 and variance 1).
You can visually explore the differences between these two very easily:
import numpy as np
import matplotlib.pyplot as plt

# Draw the same number of samples from each distribution
sample_size = 100000
uniform = np.random.rand(sample_size)
normal = np.random.randn(sample_size)

# Histogram of the uniform samples, normalized to a probability density
pdf, bins, patches = plt.hist(uniform, bins=20, range=(0, 1), density=True)
plt.title('rand: uniform')
plt.show()

# Histogram of the normal samples over [-4, 4]
pdf, bins, patches = plt.hist(normal, bins=20, range=(-4, 4), density=True)
plt.title('randn: normal')
plt.show()
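This produces two histograms: a flat one over [0, 1) titled 'rand: uniform' and a bell-shaped one centered on 0 titled 'randn: normal'.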
1) numpy.random.rand generates samples from a uniform distribution (in the range [0,1))
2) numpy.random.randn generates samples from the standard normal distribution