Neural network hidden layer vs. Convolutional hidden layer intuition

If there are 10 features and 1 output class (sigmoid activation) with a regression objective:

If I use only 5 neurons in my first dense hidden layer: will the first error be calculated solely based on half of the training feature set? Isn't it imperative to match the number of features with the number of neurons in hidden layer #1 so that the model can see all the features at once? Otherwise it's not getting the whole picture? The first forward propagation iteration would use 5 out of 10 features and get the error value (and train during backprop; assume batch gradient descent). Then the 2nd forward propagation iteration would see the remaining 5 out of 10 features with updated weights and hopefully arrive at a smaller error. BUT it's only seeing half the features at a time!
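(For reference, a minimal Keras sketch, with shapes chosen to match the question, shows the actual wiring: a dense layer of 5 units on 10 input features holds a (10, 5) weight matrix, i.e. every one of the 5 neurons is connected to all 10 features, not to half of them.)

```python
import tensorflow as tf

# Minimal sketch: 10 input features feeding a dense hidden layer of 5 neurons.
inputs = tf.keras.Input(shape=(10,))                          # all 10 features enter here
hidden = tf.keras.layers.Dense(5, activation="relu")(inputs)  # 5 neurons, fully connected
model = tf.keras.Model(inputs, hidden)

kernel, bias = model.layers[1].get_weights()
print(kernel.shape)  # (10, 5): each of the 5 neurons has a weight for every feature
print(bias.shape)    # (5,)
```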

Conversely, suppose I have a convolutional 2D layer of 64 neurons, and my training shape is (100, 28, 28, 1) (pictures of cats and dogs in greyscale). Will each of the 64 neurons see a different 28x28 vector? No, right, because it can only send one example through the forward propagation at a time? So then only a single picture (cat or dog) should be spanned across the 64 neurons? Why would you want that, since each neuron in that layer has the same filter, stride, padding and activation function? When you define a Conv2D layer... the parameters of each neuron are the same. So is only a part of the training example going into each neuron? Why have 64 neurons, for example? Just have one neuron, use a filter on it, and pass it along to a second hidden layer with another filter with different parameters!
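(Again a sketch for reference, with the 3x3 kernel size picked arbitrarily: in Keras the 64 counts 64 distinct filters, each with its own learned weights, and each one slides over the whole 28x28 image to produce its own feature map.)

```python
import tensorflow as tf

# Sketch: a Conv2D layer with 64 filters on greyscale 28x28 images.
inputs = tf.keras.Input(shape=(28, 28, 1))
conv = tf.keras.layers.Conv2D(64, kernel_size=3, padding="same", activation="relu")
model = tf.keras.Model(inputs, conv(inputs))

print(model.output_shape)  # (None, 28, 28, 64): 64 feature maps per image
kernel, bias = conv.get_weights()
print(kernel.shape)        # (3, 3, 1, 64): 64 separate 3x3 filters, not one shared filter
```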

Please explain the flaws in my logic. Thanks so much.

EDIT: I just realized for Conv2D, you flatten the training data sets so it becomes a 1D vector, and so a 28x28 image would mean having an input conv2d layer of 784 neurons. But I am still confused about the dense neural network (paragraph #1 above).

What is your "first" layer? Normally you have an input layer as the first layer, which does not contain any weights. The shape of the input layer must match the shape of your feature data. So basically, when you train a model with 10 features but only have an input layer of shape (None, 5) (where None stands for the batch size), TensorFlow will raise an exception, because it needs data for all inputs in the correct shape.
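A minimal sketch of that failure mode (shapes are hypothetical): a model built for 5 inputs, fed 10-feature data, errors out rather than silently using half the features.

```python
import numpy as np
import tensorflow as tf

# Model whose input layer expects 5 features...
inputs = tf.keras.Input(shape=(5,))
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(inputs)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="sgd", loss="mse")

# ...fed data with 10 features: raises an error instead of using 5 of them.
x = np.random.rand(100, 10)
y = np.random.rand(100, 1)
try:
    model.fit(x, y, epochs=1, verbose=0)
except ValueError as e:
    print("Shape mismatch:", e)
```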

So what you said is just not going to happen. If your input layer only takes 5 features, the next 5 features won't be fed into the net in the next iteration; instead, the next sample will be sent to the model. (Let's say no exception is thrown.) So of the next sample, also only the first 5 features would be used.

What you can do instead: use an input layer as the first layer with the correct shape for your features. Then as the second layer, you can use any size you like: 1, 10, or 100 dense neurons; it's up to you (and what works well, of course). The shape of the output must in turn match the shape of your label data. A sketch of that is below.
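Here is that setup as a sketch under the question's constraints (10 features, 1 sigmoid output, an arbitrary 5-unit hidden layer; all concrete choices hypothetical): the hidden layer width is free, but the input and output shapes are fixed by the data.

```python
import tensorflow as tf

# Input shape fixed by the features (10), output shape by the labels (1).
# The hidden layer in between can be any width: 1, 10, 100...
inputs = tf.keras.Input(shape=(10,))
hidden = tf.keras.layers.Dense(5, activation="relu")(inputs)      # each neuron sees all 10 features
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(hidden)
model = tf.keras.Model(inputs, outputs)
model.summary()  # Dense kernel shapes: (10, 5), then (5, 1)
```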

I hope this makes it clearer.
