
Will a dropout layer enhance accuracy?

I know that adding a dropout layer to a CNN model can enhance accuracy, since it decreases the impact of over-fitting. However, I built CNN models with 16, 32, and 64 filters, a kernel size of 3, and 2×2 max pooling, and noticed that in all cases the model without the dropout layer performed better than the model with a dropout layer.

from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dense, Flatten, Dropout
from keras.optimizers import Adam
from keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import classification_report, confusion_matrix
import matplotlib.pyplot as plt
import numpy as np
import pickle

# Single conv/pool block, one dense hidden layer, and the dropout layer under test
classifier = Sequential()
classifier.add(Conv2D(16, (3, 3), input_shape=(200, 200, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
classifier.add(Dense(128))
classifier.add(Activation('relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(7))
classifier.add(Activation('softmax'))
classifier.summary()
classifier.compile(optimizer=Adam(lr=0.001),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
# Augment the training images; only rescale the validation images
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

batchsize = 10
training_set = train_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Train/',
                                                 target_size=(200, 200),
                                                 batch_size=batchsize,
                                                 class_mode='categorical')

test_set = test_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Validate/',
                                            target_size=(200, 200),
                                            batch_size=batchsize,
                                            shuffle=False,  # keep order fixed for the confusion matrix below
                                            class_mode='categorical')

history = classifier.fit_generator(training_set,
                                   steps_per_epoch=2340 // batchsize,
                                   epochs=100,
                                   validation_data=test_set,
                                   validation_steps=781 // batchsize)

classifier.save('16_With_Dropout_rl_001.h5')
# Save the training history to its own file so it does not overwrite the model file above
with open('16_With_Dropout_rl_001_history.pickle', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)

# test_set was created with shuffle=False, so predictions line up with test_set.classes
Y_pred = classifier.predict_generator(test_set, steps=781 // batchsize + 1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(test_set.classes, y_pred))
print('Classification Report')
# ['coinhive', 'emotet', 'fareit', 'gafgyt', 'mirai', 'ramnit', 'razy']
class_labels = list(test_set.class_indices.keys())
report = classification_report(test_set.classes, y_pred, target_names=class_labels)
print(report)

# summarize history for accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy 16 with dropout rl .001')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss 16 with dropout rl .001')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

[Figures: training/validation accuracy and loss curves for the runs with and without dropout]

I know that adding a dropout layer to a CNN model can enhance accuracy, since it decreases the impact of over-fitting.

You can put it that way, but it doesn't hold in general. A dropout layer is a regularization technique that decreases the flexibility of your model, which can prevent overfitting, assuming that your model is flexible enough to handle the task (actually, assuming that your model is more flexible than needed). If your model is not capable of handling the task to begin with, meaning that it is too weak, then adding any kind of regularization will probably only worsen its performance.
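To see why dropout trades raw fit for generalization, here is a minimal sketch of the layer's behavior (assuming TensorFlow 2's bundled Keras rather than the standalone keras package used in the question): during training it zeroes a random subset of activations, while at inference it is a pass-through.

import numpy as np
import tensorflow as tf

x = np.ones((1, 8), dtype='float32')
drop = tf.keras.layers.Dropout(0.5)

# training=True: roughly half the units are zeroed and the survivors are
# scaled by 1/(1 - 0.5), keeping the expected activation unchanged
print(drop(x, training=True).numpy())
# training=False (inference): the input passes through untouched
print(drop(x, training=False).numpy())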

That being said, CNNs usually perform better when you include more than just one convolutional layer. The idea is that deeper convolutional layers learn more complex features, while the layers close to the input learn just basic shapes (of course, this depends on the structure of the network itself and on the complexity of the task). And since you usually want to include more convolutional layers, the complexity (and flexibility) of such a model rises, which can lead to overfitting, hence the need for regularization techniques. (Three convolutional layers with regularization will usually outperform one convolutional layer without regularization.)

Your design only includes one convolutional layer. I would suggest stacking multiple convolutional/pooling layers on top of each other and adding some dropout layers to fight the overfitting if necessary (it is probably going to be hard to see any positive effect of regularization on such a simple model); see the sketch below.
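As an illustration only, here is a minimal sketch of such a stack, reusing the 16/32/64 filter counts from the question; the 0.2 dropout rate is an arbitrary starting point, not a tuned value:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

deeper = Sequential([
    # three conv/pool blocks: early layers pick up simple shapes,
    # deeper layers combine them into more complex features
    Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.2),  # regularize the dense head, where most parameters live
    Dense(7, activation='softmax'),
])
deeper.compile(optimizer='adam',
               loss='categorical_crossentropy',
               metrics=['accuracy'])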

I agree with everything that @Matus Dubrava said, but would also suggest that you try a dropout percentage much lower than 0.5. Typically, people use something between 0.15 and 0.3; I usually use 0.2. Try a couple of different values and see what works best (a sweep like the one sketched below is an easy way to compare). And, like Matus suggested, try a few more convolutional layers. I have had a lot of success with three-convolutional-layer architectures across tabular and image generation models.
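A rough sketch of such a sweep, assuming a hypothetical build_model helper that wraps the question's architecture with a configurable rate, and reusing the training_set, test_set, and batchsize defined in the question (the low epoch count is just to keep the comparison cheap):

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

def build_model(dropout_rate):
    # Same single-block architecture as in the question,
    # with the dropout rate made configurable
    model = Sequential([
        Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dropout(dropout_rate),
        Dense(7, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])
    return model

for rate in (0.15, 0.2, 0.25, 0.3):
    model = build_model(rate)
    h = model.fit_generator(training_set,
                            steps_per_epoch=2340 // batchsize,
                            epochs=20,
                            validation_data=test_set,
                            validation_steps=781 // batchsize)
    print('dropout %.2f -> best val accuracy %.4f'
          % (rate, max(h.history['val_accuracy'])))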
