How to decide activation function in neural network

I am using a feedforward, backpropagation, multilayer neural network, with a sigmoid function as the activation function, which has a range of -1 to 1. But the minimum error will not go below 5.8, and I want it much lower; you can see the output after 100,000 iterations.

[Figure: plot of error vs. iteration for the NN]

I think this is because my output range is above 1, while the sigmoid function's range is only -1 to 1. Can anybody suggest how I can overcome this problem, given that my desired output range is 0 to 2.5? Which activation function would be best for this range?

The vanilla sigmoid function is:

import math

def sigmoid(x):
    return 1 / (1 + math.e ** -x)  # logistic sigmoid, range (0, 1)

You could transform that to:

def mySigmoid(x):
    return 2.5 / (1 + math.e ** -x)  # scaled sigmoid, range (0, 2.5)

in order to map the output to your desired 0 to 2.5 range.
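Note that if you rescale the activation function, the derivative used during backpropagation must be rescaled to match, or the weight updates will be inconsistent with the forward pass. A minimal sketch (the derivative helper is my own illustration, not from the original post):

import math

def mySigmoid(x):
    # Sigmoid scaled to the 0 to 2.5 range
    return 2.5 / (1 + math.e ** -x)

def mySigmoid_derivative(x):
    # d/dx of 2.5 * s(x) is 2.5 * s(x) * (1 - s(x)),
    # where s is the vanilla sigmoid
    s = 1 / (1 + math.e ** -x)
    return 2.5 * s * (1 - s)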

If you are seeking to reduce output error, there are a couple of things to look at before tweaking a node's activation function.

First, do you have a bias node? Bias nodes have several implications, but, most relevant to this discussion, they allow the network output to be translated to the desired output range. As this reference states:

The use of biases in a neural network increases the capacity of the network to solve problems by allowing the hyperplanes that separate individual classes to be offset for superior positioning.

This post provides a very good discussion: Role of Bias in Neural Networks. This one is good, too: Why the BIAS is necessary in ANN? Should we have separate BIAS for each layer?
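For intuition, here is a minimal sketch of a single sigmoid neuron with a bias term (the function name and variables are illustrative, not from either linked post):

import math

def neuron_output(weights, inputs, bias):
    # The bias shifts the weighted sum before the activation,
    # which lets the separating hyperplane (and hence the output) be offset
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 / (1 + math.e ** -total)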

Second, it often helps to normalize your inputs and outputs. As you note, your sigmoid offers a range of +/- 1. This small range can be problematic when trying to learn functions that have a range of, say, 0 to 1000. To aid learning, it's common to scale and translate inputs to accommodate the node activation functions. In this example, one might divide the range by 500, yielding a 0 to 2 range, and then subtract 1 from that range. In this manner, the inputs have been normalized to a range of -1 to 1, which better fits the activation function. Note that the network output should then be denormalized: first add 1 to the output, then multiply by 500.

In your case, you might consider scaling the inputs by 0.8, then subtracting 1 from the result. You would then add 1 to the network output and multiply by 1.25 to recover the desired range. Note that this method may be the easiest to implement, since it does not directly change your network topology the way adding bias nodes would.
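A minimal sketch of that scaling for your 0 to 2.5 range (function names are illustrative):

def normalize(y):
    # Map a value in [0, 2.5] to [-1, 1]: scale by 0.8, then subtract 1
    return y * 0.8 - 1

def denormalize(y):
    # Invert the mapping: add 1, then multiply by 1.25 to recover [0, 2.5]
    return (y + 1) * 1.25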

Finally, have you experimented with changing the number of hidden nodes? Although I believe the first two options are better candidates for improving performance, you might give this one a try. (Just as a point of reference, I can't recall an instance in which modifying the activation function's shape improved network response more than options 1 and 2.)

Here are some good discussions of hidden layer/node configuration: multi-layer perceptron (MLP) architecture: criteria for choosing number of hidden layers and size of the hidden layer? How to choose number of hidden layers and nodes in neural network?

Your 24 inputs make this a high-dimensional problem. Ensure that your training dataset adequately covers the input state space, and that your test data and training data are drawn from similarly representative populations. (Take a look at the "cross-validation" discussions around training neural networks.)
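As a rough sketch of keeping the two sets representative, you could shuffle before splitting (assuming NumPy arrays X and y; the helper name is mine):

import numpy as np

def holdout_split(X, y, test_fraction=0.2, seed=0):
    # Shuffle indices so train and test sets are drawn from the same
    # underlying population, then hold out a fraction for testing
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(len(X) * (1 - test_fraction))
    return X[idx[:cut]], X[idx[cut:]], y[idx[:cut]], y[idx[cut:]]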


