
Confusion about neural networks activation functions

I followed a tutorial about an image classifier using Python and TensorFlow.

I'm now trying to apply deep learning to a custom situation. I made a simulation program of sellers and buyers, where each customer buys a stone according to their wishes. The stones have a color, a size, and a percentage of curve. The closer the stone is to the customer's desired values, the more the customer is willing to pay. For the seller, the rarer the stone is, the higher the price should be. The program then generates 100,000 stone purchases to feed a neural network, which will try to beat the other sellers. The dataset looks like this:

[dataset screenshot]

I'm now trying to create my neural network. In the tutorial, he uses two Conv2D layers with a ReLU activation function and MaxPooling2D, then a Flatten layer, a Dense layer, and finally another Dense layer with a sigmoid activation function.
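
For reference, a minimal sketch of that tutorial-style architecture might look like this (the input shape, filter counts, and kernel sizes below are assumptions for illustration, not values from the tutorial):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Tutorial-style image classifier: Conv2D+ReLU blocks with max pooling,
# then Flatten, a Dense layer, and a sigmoid output. The 64x64 RGB input
# shape and the filter counts are assumptions.
cnn = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(1, activation='sigmoid'),
])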

After reading some documentation, I found that the Conv2D layer is for matrix-like input, but my data is already flat, so I prefer to use only Dense layers.

My first question is: does my neural network need a Dense layer with a ReLU function like this:

model.add(Dense(64, activation='relu', input_dim=3))

if my program generates only positive values?

My second question is: does my neural network need a sigmoid function if I already normalized my data to values between 0 and 1 by dividing them like this?

X[:,0] /= 256.0  # color
X[:,1] /= 50.0   # size
X[:,2] /= 100.0  # percentage of curve

These values are the maximum of each column. So do I need a sigmoid function?

Right now my neural network looks like this:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=3))
model.add(Dense(64, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
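
For completeness, I would then compile and train it along these lines (a minimal sketch; the 'mse' loss for the normalized price target and the training settings are just one plausible choice, not part of my current code):

# Assumed setup: X has shape (n, 3) and y holds normalized prices in [0, 1].
model.compile(optimizer='adam', loss='mse', metrics=['mae'])
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2)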

But I'm not sure about the effectiveness of my model. Could my neural network work? If not, what kinds of layers and activation functions should I use?

"My first question is: does my neural network need a Dense layer with a ReLU function like that?"

Yes. Your network requires the ReLUs even if your data is only positive. The idea of ReLUs (and of activation functions in general) is that they add a certain complexity, so that the classifier can learn to generalize.

Consider a CNN that takes images as input. The input data there also consists of only positive values ([0, 1] or [0, 255]), and such networks usually have many, many layers with the ReLU nonlinearity.

"If my program generates only positive values?"

Your confusion is that your actual input-output relationship produces only positive values, but your classifier still contains weights that can be negative, so without the nonlinearity your layer outputs could still be negative.
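
A tiny illustration (with made-up numbers): even with strictly positive inputs, a negative learned weight drives the pre-activation negative, and ReLU is what clips it back to zero:

import numpy as np

x = np.array([0.8, 0.5, 0.9])    # strictly positive inputs
w = np.array([-1.2, 0.4, 0.3])   # learned weights can be negative
z = np.dot(w, x)                 # pre-activation: -0.49, negative
relu = max(0.0, z)               # ReLU clips it to 0.0
print(z, relu)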

Also, if you had no nonlinearities like ReLU, there would be no point in having multiple layers, as they would add no complexity to your classifier.
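
That is because linear layers compose into a single linear map, as a quick check shows (random matrices stand in for learned weights here):

import numpy as np

rng = np.random.default_rng(0)
x = rng.random(3)                 # an input vector
W1 = rng.standard_normal((4, 3))  # weights of a first linear layer
W2 = rng.standard_normal((2, 4))  # weights of a second linear layer

two_layers = W2 @ (W1 @ x)        # two linear layers, no activation between
one_layer = (W2 @ W1) @ x         # a single equivalent linear layer
print(np.allclose(two_layers, one_layer))  # True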

"My second question is: does my neural network need a sigmoid function if I already normalized my data to make them between 0 and 1 by dividing them like this?"

Yes, you also need the sigmoid, for the same reasoning as above: your data may be positive, but your output layer would still be able to produce negative values, or values outside your expected range.

Having a linear output activation function would make learning nearly impossible, especially when your output range is within [0, 1].
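
As a quick illustration, the sigmoid squashes any real-valued layer output into (0, 1), whereas a linear output can land anywhere:

import numpy as np

z = np.array([-3.0, -0.5, 0.0, 2.0, 7.0])  # raw linear layer outputs
sigmoid = 1.0 / (1.0 + np.exp(-z))         # squashed into (0, 1)
print(sigmoid)  # [0.047 0.378 0.5 0.881 0.999]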

It is fine if you normalize like that (dividing by the max). You could also apply a sigmoid to the inputs, but that would be somewhat less accurate at the max and min values. But I don't understand why you would use a Conv2D layer, since the network is fully connected with only 4 inputs. Also, if you generate the dataset completely at random, the network will not learn anything.
