
Will a dropout layer enhance accuracy?

I know that adding a dropout layer to a CNN model can enhance accuracy, since it decreases the impact of over-fitting. However, I built CNN models with 16, 32, and 64 filters, a 3x3 kernel, and 2x2 max pooling, and noticed that in all cases the model without the dropout layer performed better than the model with a dropout layer.

import pickle

import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import classification_report, confusion_matrix

import keras
from keras.models import Sequential
from keras.layers import Conv2D, Activation, MaxPooling2D, Dense, Flatten, Dropout
from keras.preprocessing.image import ImageDataGenerator

classifier = Sequential()
classifier.add(Conv2D(16, (3, 3), input_shape=(200, 200, 3)))
classifier.add(Activation('relu'))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Flatten())
classifier.add(Dense(128))
classifier.add(Activation('relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(7))
classifier.add(Activation('softmax'))
classifier.summary()
classifier.compile(optimizer=keras.optimizers.Adam(lr=0.001),
                   loss='categorical_crossentropy',
                   metrics=['accuracy'])
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)

batchsize=10
training_set = train_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Train/',
                                                 target_size=(200, 200),
                                                 batch_size=batchsize,
                                                 class_mode='categorical')

test_set = test_datagen.flow_from_directory('/home/osboxes/Downloads/Downloads/Journal_Paper/Malware_Families/Spectrogram/Validate/',
                                            target_size=(200, 200),
                                            batch_size=batchsize,
                                            shuffle=False,  # keep order so predictions line up with test_set.classes
                                            class_mode='categorical')
history = classifier.fit_generator(training_set,
                                   steps_per_epoch=2340 // batchsize,
                                   epochs=100,
                                   validation_data=test_set,
                                   validation_steps=781 // batchsize)

classifier.save('16_With_Dropout_rl_001.h5')
# write the history to its own file; the original code pickled it over
# the .h5 file above, overwriting the saved model
with open('16_With_Dropout_rl_001_history.pkl', 'wb') as file_pi:
    pickle.dump(history.history, file_pi)
Y_pred = classifier.predict_generator(test_set, steps=781 // batchsize + 1)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(test_set.classes, y_pred))
print('Classification Report')
class_labels = list(test_set.class_indices.keys())
# classes: coinhive, emotet, fareit, gafgyt, mirai, ramnit, razy
report = classification_report(test_set.classes, y_pred, target_names=class_labels)
print(report) 

# summarize history for accuracy
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('model accuracy 16 with dropout rl .001')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss 16 with dropout rl .001')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()

[accuracy and loss plots for the models with and without dropout]

I know that adding a dropout layer to a CNN model can enhance accuracy, since it decreases the impact of over-fitting.

You can put it that way, but it doesn't hold in general. Dropout is a regularization technique: it decreases the flexibility of your model, which can prevent overfitting, assuming that your model is flexible enough to handle the task (in fact, assuming that your model is more flexible than it needs to be). If your model is not capable of handling the task to begin with, meaning that it is too weak, then adding any kind of regularization will probably only worsen its performance.
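
To make the mechanism concrete, here is a minimal sketch (using tf.keras; not from the original post) showing what a dropout layer actually does: during training it zeroes a random fraction of its inputs and rescales the survivors, while at inference it passes everything through unchanged.

import numpy as np
import tensorflow as tf

x = np.ones((1, 8), dtype='float32')
drop = tf.keras.layers.Dropout(0.5)

# training mode: roughly half the entries are zeroed, the rest scaled by 1/(1-0.5) = 2
print(drop(x, training=True).numpy())
# inference mode: identity, all ones come back unchanged
print(drop(x, training=False).numpy())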

That being said, CNNs usually perform better when they include more than one convolutional layer. The idea is that deeper convolutional layers learn more complex features, while the layers close to the input learn basic shapes (of course, this depends on the structure of the network itself and on the complexity of the task). Since you usually want to include more convolutional layers, the complexity (and flexibility) of such a model rises, which can lead to overfitting; hence the need for regularization techniques. (Three convolutional layers with regularization will usually outperform one convolutional layer without it.)

Your design includes only one convolutional layer. I would suggest stacking multiple convolutional/pooling blocks on top of each other and adding some dropout layers to fight overfitting where necessary (it is probably going to be hard to see any positive effect of regularization on such a simple model); a sketch of what that could look like follows.
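
For illustration, a minimal sketch of such a stacked design, reusing the question's filter counts (16/32/64), 3x3 kernels, and 2x2 pooling; the dropout placement and rate here are illustrative assumptions, not a prescription:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

deeper = Sequential([
    # three conv/pool blocks instead of one; deeper blocks learn richer features
    Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),  # regularize the large dense layer
    Dense(7, activation='softmax'),
])
deeper.compile(optimizer='adam',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
deeper.summary()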

I agree with everything that @Matus Dubrava said, but would also suggest that you try a dropout percentage much lower than 0.5. Typically, people use something between 0.15 and 0.3; I usually use 0.2. Try a couple of different values and see what works best (see the sweep sketched below). And, as Matus suggested, try a few more convolutional layers; I have had a lot of success with three-convolutional-layer architectures across tabular and image-generation models.
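
As a sketch of that rate sweep, reusing training_set, test_set, and batchsize from the question; build_model and the epoch count are illustrative assumptions, not part of the original post:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dense, Flatten, Dropout

def build_model(dropout_rate):
    # hypothetical helper: the question's architecture with the rate as a parameter
    m = Sequential([
        Conv2D(16, (3, 3), activation='relu', input_shape=(200, 200, 3)),
        MaxPooling2D(pool_size=(2, 2)),
        Flatten(),
        Dense(128, activation='relu'),
        Dropout(dropout_rate),
        Dense(7, activation='softmax'),
    ])
    m.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
    return m

results = {}
for rate in [0.15, 0.2, 0.3, 0.5]:
    model = build_model(rate)
    h = model.fit_generator(training_set,
                            steps_per_epoch=2340 // batchsize,
                            epochs=20,  # shorter runs are enough for a comparison
                            validation_data=test_set,
                            validation_steps=781 // batchsize)
    results[rate] = max(h.history['val_accuracy'])
print(results)  # keep the rate with the best validation accuracy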
