
My Convolutional Neural Network is Overfitting

Recently, I built a simple convolutional neural network for hand gesture image recognition, using background subtraction to make the hand appear as a white shape on a black background. It was built mostly with keras Conv2D layers. My dataset has 1000 pics for training and 100 pics for validation and testing.

The problem oddly occurs immediately after the first epoch, during which the model's loss goes down a great deal. It usually drops from some big number like 183 down to 1 at the start of the second epoch. All the pics in the dataset are of my own hand, captured using cv2, but I only conduct testing with my own hand, so that should not be a problem. In case the dataset was the problem, I tried 3 different datasets, one using cv2's Canny method, which essentially traces an outline of the hand and makes the rest of the pic black, to see if that made a difference. Regardless, the same thing continued to happen.

Furthermore, I have added multiple Dropout layers in different places to see the effect, and the same thing always occurs: the loss drastically decreases and the model shows signs of overfitting. I have also implemented EarlyStopping and tried different numbers of layers, but the same result always occurs.

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.callbacks import EarlyStopping, ModelCheckpoint

model = Sequential()
model.add(Conv2D(32, (3,3), activation = 'relu',
    input_shape = (240, 215, 1)))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3,3), activation = 'relu'))
model.add(Conv2D(64, (3,3), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3,3), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(256, (3,3), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.25))
#model.add(Conv2D(256, (3,3), activation = 'relu'))
#model.add(MaxPooling2D((2,2)))
#model.add(Conv2D(128, (3,3), activation = 'relu'))
#model.add(MaxPooling2D((2,2)))
#model.add(Conv2D(64, (3,3), activation = 'relu'))
#model.add(MaxPooling2D((2,2)))
model.add(Flatten())
model.add(Dense(150, activation = 'relu'))
#model.add(Dropout(0.25))
#model.add(Dense(1000, activation = 'relu'))
model.add(Dropout(0.75))
model.add(Dense(6, activation = 'softmax'))
model.summary()
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy',
        metrics = ['acc'])
callbacks_list = [EarlyStopping(monitor = 'val_loss', patience = 10),
        ModelCheckpoint(filepath = 'model.h5', monitor = 'val_loss',
        save_best_only = True)]

The commented sections of the code are changes I have tried. I have also varied the Dropout values and their positions a great deal, and nothing significant has changed. Could anyone offer any advice on why my model overfits this quickly?

When dealing with such massive overfitting, a good starting point is to reduce the number of layers.

Although you add a Dropout layer after many of the max-pooling layers, you still suffer from overfitting.

Here are some of my recommendations:

  1. Ensure that you have a comprehensive dataset with clean labels. No matter how we tune the neural network, if the dataset is not clean, we cannot obtain good results.
  2. To begin with, add at most 3 stacks of convolution + max-pooling + dropout. (32 + 64 + 128) filters would be a good starting point.
  3. Use GlobalAveragePooling2D instead of Dense layers. The latter are not needed in a convolutional neural network, except for the last layer with sigmoid or softmax.
  4. Try using SpatialDropout2D. Compared to typical Dropout, which is applied to each element in the feature map, SpatialDropout2D drops entire feature maps.
  5. Try using data augmentation. This way, you create more artificial examples, and your network will be less prone to overfitting.
  6. If none of these work, take a pre-trained network and apply transfer learning to your task at hand.
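Putting recommendations 2 through 4 together, a slimmed-down version of your model could look roughly like this. Filter counts and dropout rates are starting points, not tuned values:

```python
from keras import Input
from keras.models import Sequential
from keras.layers import (Conv2D, MaxPooling2D, SpatialDropout2D,
                          GlobalAveragePooling2D, Dense)

# Three conv blocks (32 -> 64 -> 128 filters), SpatialDropout2D in place of
# element-wise Dropout, and GlobalAveragePooling2D replacing the
# Flatten + Dense(150) head. Only the softmax classifier layer remains dense.
model = Sequential([
    Input(shape=(240, 215, 1)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    SpatialDropout2D(0.25),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    SpatialDropout2D(0.25),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    SpatialDropout2D(0.25),
    GlobalAveragePooling2D(),
    Dense(6, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['acc'])
```

With far fewer parameters than the Flatten + Dense head, this network has less capacity to memorize a 1000-image training set.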

Yes, it is a clear case of overfitting. Here are my suggestions:

  1. Try reducing the number of hidden layers.
  2. Increase the dropout to 0.5.
  3. Create more synthetic images or apply transformations to the raw images.
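As a rough sketch of suggestion 3, Keras's random augmentation layers can apply such transformations on the fly during training. The parameter values below are illustrative, and dummy arrays stand in for your real dataset; horizontal flips are deliberately left out, since mirroring a hand may change the gesture's meaning:

```python
import numpy as np
from keras import layers, models

# Augmentation pipeline: small random rotations, shifts, and zooms.
augment = models.Sequential([
    layers.RandomRotation(0.05),         # up to ~18 degrees either way
    layers.RandomTranslation(0.1, 0.1),  # shift up to 10% along each axis
    layers.RandomZoom(0.1),              # zoom in/out up to 10%
])

# Dummy stand-ins for the real data: 8 grayscale 240x215 images.
x = np.random.rand(8, 240, 215, 1).astype('float32')

# training=True ensures the random transformations are actually applied.
augmented = augment(x, training=True)
```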
