
Sigmoid of x is 1

I just read the book "Make Your Own Neural Network". Now I am trying to create a NeuralNetwork class in Python. I use the sigmoid activation function. I wrote basic code and tried to test it, but my implementation didn't work properly at all. After long debugging and comparison with the code from the book, I found out that the sigmoid of a very big number is 1 because Python rounds it. I use numpy.random.rand() to generate weights, and this function returns only values from 0 to 1. After summing all the products of weights and inputs I get a very big number. I fixed the problem with the numpy.random.normal() function, which generates random numbers from a range such as (-1, 1). But I have some questions:

  • Is sigmoid a good activation function?
  • What should I do if the output of a node is still so big that Python rounds the result to 1, which is impossible for the sigmoid?
  • How can I prevent Python from rounding floats that are very close to an integer?
  • Any advice for me as a beginner in neural networks (books, techniques, etc.)?
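The saturation described in the question can be reproduced with a minimal sketch (not part of the original post; the layer size of 500 and the random seed are arbitrary choices for illustration):

```python
import numpy as np

def sigmoid(x):
    # standard logistic function: 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

np.random.seed(0)

n_inputs = 500                          # arbitrary layer size, just for illustration
inputs = np.random.rand(n_inputs)       # inputs in [0, 1)
weights = np.random.rand(n_inputs)      # rand() weights in [0, 1): all positive

z = np.dot(weights, inputs)             # every term is positive, so the sum is roughly n_inputs / 4
print(z)                                # on the order of 125
print(sigmoid(z))                       # exactly 1.0: exp(-z) is far below floating-point resolution

# Zero-mean weights (what the question switched to) keep the weighted sum small
weights_zero_mean = np.random.normal(0.0, 1.0 / np.sqrt(n_inputs), n_inputs)
z_small = np.dot(weights_zero_mean, inputs)
print(sigmoid(z_small))                 # a value strictly between 0 and 1
```

The issue is not Python's float rounding as such: all-positive weights make the pre-activation grow with the number of inputs, which drives the sigmoid into its flat region.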
  1. The answer to this question obviously depends on context, i.e. on what is meant by "good". The sigmoid activation function produces outputs between 0 and 1. As such, it is the standard output activation for binary classification, where you want your neural network to output a number between 0 and 1 that can be interpreted as the probability of the input belonging to the specified class. However, if you are using sigmoid activations throughout your neural network (i.e. in intermediate layers as well), you might consider switching to the ReLU activation function. Historically, the sigmoid activation function was used throughout neural networks as a way to introduce non-linearity, so that a neural network could do more than approximate linear functions. However, it was found that sigmoid activations suffer heavily from the vanishing gradients problem, because the function is so flat far from 0. As such, nowadays most intermediate layers use ReLU activations (or something even fancier, e.g. SELU, Leaky ReLU, etc.). The ReLU activation function is 0 for inputs less than 0 and equals the input for inputs greater than 0. It has been found to be sufficient for introducing non-linearity into a neural network (a minimal sketch of both functions appears after this list).

  2. Generally you don't want to be in a regime where your outputs are so huge or so small that the computation becomes numerically unstable. One way to help fix this, as mentioned earlier, is to use a different activation function (e.g. ReLU). Another, and perhaps even better, way is to initialize the weights better, e.g. with the Xavier-Glorot initialization scheme, or simply to initialize them to smaller values, e.g. within the range [-.01, .01]. Basically, you scale the random initializations so that your outputs stay in a reasonable range of values rather than becoming some gigantic or minuscule number. You can certainly also do both (a short initialization sketch follows this list).

  3. You can use higher-precision floats to make Python keep more decimals around, e.g. np.float64 instead of np.float32. However, this increases the computational cost and probably isn't necessary. Most neural networks today use 32-bit floats and work just fine. See points 1 and 2 for better ways to solve your problem (a precision comparison is sketched after this list).

  4. This question is overly broad. I would say that the Coursera course and specialization by Prof. Andrew Ng is my strongest recommendation for learning about neural networks.
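Regarding point 1, a minimal sketch of the two activation functions and their gradients (the helper names are illustrative, not from the original answer); the tiny sigmoid gradient at |x| = 10 is what the vanishing-gradient remark refers to:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)            # at most 0.25 (at x = 0), nearly 0 far from 0

def relu(x):
    return np.maximum(0.0, x)       # 0 for x < 0, x for x >= 0

def relu_grad(x):
    return (x > 0).astype(float)    # 1 for positive inputs, 0 otherwise

x = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
print(sigmoid(x))        # everything squashed into (0, 1)
print(sigmoid_grad(x))   # ~4.5e-05 at |x| = 10: the vanishing-gradient problem
print(relu(x))           # [ 0.  0.  0.  1. 10.]
print(relu_grad(x))      # gradient stays 1 for any positive input
```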
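Regarding point 2, a sketch of the two initializations mentioned above, assuming made-up layer sizes (784 inputs, 100 outputs); the Xavier-Glorot variant shown is the uniform one:

```python
import numpy as np

n_in, n_out = 784, 100   # made-up layer sizes for illustration

# Xavier-Glorot (uniform variant): scale the range by the layer's fan-in and fan-out
limit = np.sqrt(6.0 / (n_in + n_out))
w_glorot = np.random.uniform(-limit, limit, size=(n_out, n_in))

# Or simply use a small fixed range, as suggested above
w_small = np.random.uniform(-0.01, 0.01, size=(n_out, n_in))

x = np.random.rand(n_in)             # inputs in [0, 1)
print(np.abs(w_glorot @ x).max())    # pre-activations stay in a modest range
print(np.abs(w_small @ x).max())     # even smaller, so the sigmoid never saturates
```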
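Regarding point 3, a small demonstration of how float precision affects where the sigmoid becomes indistinguishable from 1 (the input 20 is just a convenient example):

```python
import numpy as np

# sigmoid(20) evaluated at two precisions
for dtype in (np.float32, np.float64):
    x = dtype(20.0)
    one = dtype(1.0)
    print(dtype.__name__, one / (one + np.exp(-x)))
# float32 1.0              -> exp(-20) ~ 2e-9 is below float32 resolution near 1
# float64 0.9999999979...  -> float64 still resolves the difference

# For large enough inputs even float64 rounds to exactly 1.0
print(1.0 / (1.0 + np.exp(-40.0)))   # 1.0
```

Switching to float64 only pushes the saturation point further out; keeping the pre-activations small (points 1 and 2) is the more robust fix.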
