
Training process of a CNN (Python Keras)

Consider the following architecture for a CNN (the code fragment was taken from this link):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

# input_shape and num_classes are assumed to be defined earlier
# (e.g. (28, 28, 1) and 10 for MNIST-style data)
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))
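
For reference, a model like this would typically be compiled and trained along the following lines; the optimizer choice and the x_train / y_train / x_test / y_test variables below are just placeholders, not part of the original snippet:

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

model.fit(x_train, y_train,
          batch_size=128,
          epochs=12,
          validation_data=(x_test, y_test))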

My questions are basically about the training process of a CNN.

  1. When you train the model, do the outputs of the Flatten layer change across epochs?
  2. If the outputs of the Flatten layer change, does that mean there is a backpropagation process before the Flatten layer as well (i.e. through Conv2D -> Conv2D -> MaxPooling2D -> Flatten)?
  3. What is the necessity of using a Dropout after the MaxPooling2D layer (or any layer before Flatten)?
  1. The Flatten layer simply takes the output of the previous layer and flattens everything into one long vector instead of keeping it as a multidimensional array. So the Flatten layer itself doesn't have any weights to learn, and the way it calculates its output never changes. Its actual output does change while you train, because the preceding layers are being trained: their outputs are changing, and thus the input to Flatten is changing.

  2. There is nothing unique about the Flatten layer that would prevent backpropagation from being applied to the layers before it. If there were, the preceding layers could never be trained. In order to train the layers before Flatten there has to be backpropagation, because backpropagation is the process used to update the weights in the network. If it were never applied to the earlier layers, they would never be updated and would never learn anything. (The sketch after this list illustrates points 1 and 2 concretely.)

  3. Dropout layers are used for their regularizing effect, to reduce overfitting. By randomly selecting some neurons to be deactivated on any given pass, dropout attempts to force the network to learn more independent, robust features: the network can't rely on a small subset of neurons, because they may not be available. The same idea applies both before and after the Flatten layer.
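
To make points 1 and 2 concrete, here is a minimal sketch, assuming a tiny model and random data (every variable name here is made up for illustration). It shows that MaxPooling2D and Flatten have no trainable weights, while the Conv2D kernel does change after a training step, which is exactly why the output of Flatten changes across epochs:

import numpy as np
from keras.models import Sequential, Model
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# tiny model on random data, purely for illustration
model = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(10, activation='softmax'),
])
model.compile(loss='categorical_crossentropy', optimizer='adam')

# number of weight arrays per layer: Conv2D and Dense have them, MaxPooling2D and Flatten don't
print([len(layer.get_weights()) for layer in model.layers])   # expected: [2, 0, 0, 2]

x = np.random.rand(32, 28, 28, 1)
y = np.eye(10)[np.random.randint(0, 10, 32)]   # random one-hot labels

flatten_model = Model(model.input, model.layers[2].output)    # taps the Flatten output
flatten_before = flatten_model.predict(x)
kernel_before = model.layers[0].get_weights()[0].copy()

model.fit(x, y, epochs=1, verbose=0)

# backpropagation updated the Conv2D kernel, so the Flatten output changed too
print(np.allclose(kernel_before, model.layers[0].get_weights()[0]))   # expected: False
print(np.allclose(flatten_before, flatten_model.predict(x)))          # expected: False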

Whether or not including dropout at specific points in your network is useful depends on your particular use case. For example, if your network isn't overfitting, then dropout may not help improve your results. Deciding exactly where to apply dropout and how much to use is often a matter of experimentation to see what works for your data.
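
As a rough way to run that experiment, you could train two otherwise identical copies of your model, with and without the Dropout layers, and compare the gap between training and validation accuracy. This is only a sketch: it reuses the imports from your snippet, input_shape / num_classes and the data variables are placeholders, and the history key names can differ slightly between Keras versions:

def build_model(use_dropout):
    m = Sequential()
    m.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))
    m.add(Conv2D(64, (3, 3), activation='relu'))
    m.add(MaxPooling2D(pool_size=(2, 2)))
    if use_dropout:
        m.add(Dropout(0.25))
    m.add(Flatten())
    m.add(Dense(128, activation='relu'))
    if use_dropout:
        m.add(Dropout(0.5))
    m.add(Dense(num_classes, activation='softmax'))
    m.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
    return m

for use_dropout in (False, True):
    history = build_model(use_dropout).fit(
        x_train, y_train, epochs=12, batch_size=128,
        validation_data=(x_val, y_val), verbose=0)
    # a large gap between training and validation accuracy suggests overfitting
    print('dropout' if use_dropout else 'no dropout',
          history.history['accuracy'][-1],
          history.history['val_accuracy'][-1])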
