
How can I fix CNN layer dimension errors with a fixed kernel size and fixed number of filters?

I am currently trying to recreate a CNN model used in the paper

"Using CNN for facial expression recognition: a study of the effects of kernel size and numbers of filters on accuracy" by Abhinav Agrawal and Namita Mittal ( https://doi.org/10.1007/s00371-019-01630-9 ).

It's unique in that it uses fixed kernel sizes and a fixed number of filters, with no dropout layers or fully connected layers. The proposed model from the paper is as follows:

Input data (64 × 64) grayscale image
Data augmentation
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 8 × 8 × 32, RELU, STRIDE (2 × 2)
CONV 8 × 8 × 32, BATCH NORM
CONV 7 × 7 × 7, RELU, STRIDE (1 × 1)
SOFTMAX

My code is as follows and tries to faithfully recreate the model from the paper, using the same dataset and images.

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization
from tensorflow.keras.models import Sequential
#from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt

def plot_model_acc(history):
    #model history for accuracy
    plt.plot(history.history['accuracy'], label='accuracy')
    plt.plot(history.history['val_accuracy'], label = 'val_accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend(loc='upper left')
    plt.savefig('model_acc.png', bbox_inches='tight')
    plt.close()

def plot_model_loss(history):
    #model history for loss
    plt.plot(history.history['loss'], label='loss')
    plt.plot(history.history['val_loss'], label='val_loss')
    plt.title('Model Loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(loc='upper left')
    plt.savefig('model_loss.png', bbox_inches='tight')
    plt.close()



#model parameters
train_directory = 'train'
test_directory = 'test'
val_directory = 'validation'
num_train = 28709
num_test = 7178
num_val = 7178
batch_size = 64
num_epoch = 10


#creating the ImageDataGenerator instances and rescaling pixel values from 0-255 to 0-1
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)

#angry 0, disgusted 1, fearful 2, happy 3, neutral 4, sad 5, surprised 6

train_iterator = train_datagen.flow_from_directory(
    train_directory,
    target_size=(64, 64),
    class_mode='categorical',
    batch_size=batch_size,
    color_mode='grayscale')

test_iterator = test_datagen.flow_from_directory(
    test_directory,
    target_size=(64, 64),
    class_mode='categorical',
    batch_size=batch_size,
    color_mode='grayscale')

val_iterator = val_datagen.flow_from_directory(
    val_directory,
    target_size=(64, 64),
    class_mode='categorical',
    batch_size=batch_size,
    color_mode='grayscale')
#color_mode='grayscale' to match the single-channel (64, 64, 1) input
#class_mode='categorical' returns 2D one-hot encoded labels

#model creation
#output size after a conv layer is
#[(W-K+2P)/S]+1
#W is the input size - 64 for a 64x64x1 image
#K is the kernel size - 8
#P is the padding - 0 with the default 'valid' padding
#S is the stride - 1 by default, or (2, 2) where specified
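#working through the formula with the default 'valid' padding (P=0):
#conv1 (stride 1): (64-8)/1+1 = 57
#conv2 (stride 2): (57-8)/2+1 = 25 (floored)
#conv3 (stride 1): (25-8)/1+1 = 18
#conv4 (stride 2): (18-8)/2+1 = 6
#conv5 (stride 1): 6-8 is already negative, which matches the
#'Negative dimension size caused by subtracting 8 from 6' error below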


model = Sequential()

model.add(Conv2D(32, kernel_size=(8, 8), input_shape=(64,64,1)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())

model.summary()

model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())

model.summary()

model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())

model.summary()

model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(8, 8), activation='relu', strides=(2,2)))
model.add(Conv2D(32, kernel_size=(8, 8)))
model.add(BatchNormalization())

model.summary()

model.add(Conv2D(7, kernel_size=(7, 7), strides=(1,1)))
model.add(layers.Activation('softmax'))

model.summary()


#optimizer definition; a lower learning rate takes longer to train, but higher values may make training unstable
adam_optimizer = keras.optimizers.Adam(learning_rate=0.0001)

#CategoricalCrossentropy for one-hot encoded labels (class_mode='categorical'); other label formats need other loss functions
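#note: from_logits=True expects raw logits, while the model above already ends with a softmax activation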
model.compile(optimizer=adam_optimizer,
              loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

history = model.fit(
    train_iterator, 
    steps_per_epoch=num_train // batch_size, 
    epochs=num_epoch,
    validation_data=test_iterator,
    validation_steps=num_val // batch_size
    )

print(history.history.keys())
plot_model_acc(history)
plot_model_loss(history)
model.save_weights('model.h5')

I was running into the error:

"tensorflow.python.framework.errors_impl.InvalidArgumentError: Negative dimension size caused by subtracting 8 from 6 for 'conv2d_4/Conv2D' with input shapes: [?, 6, 6, 32], [8, 8, 32, 32]".

I realize that the output shape from the consecutive convolution layers is probably becoming too small, which throws the error. How can I fix this? Padding? What is throwing me off is that I can't change the kernel size or number of filters to fix the error.

Any insight would be appreciated.

Set the padding to 'same' in each convolution layer:

model.add(Conv2D(32, kernel_size=(8, 8),padding='same'))

The default padding is 'valid', which adds no padding, so each convolution shrinks the output. 'same' pads the input so that, with stride 1, the output shape is the same as the input shape.
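For example, a minimal sketch of the first block with padding='same' (assuming the same imports as in your question):

#with 'same' padding a stride-1 conv keeps the 64x64 spatial size,
#and a stride-2 conv reduces it to ceil(64/2) = 32, so repeated blocks
#shrink the feature map much more slowly than the default 'valid' padding
model = Sequential()
model.add(Conv2D(32, kernel_size=(8, 8), padding='same', input_shape=(64, 64, 1)))
model.add(BatchNormalization())
model.add(Conv2D(32, kernel_size=(8, 8), padding='same', activation='relu', strides=(2, 2)))
#...add the remaining CONV/BATCH NORM blocks the same way...
model.summary()  #check the output shapes to confirm nothing goes negative

Note that even with 'same' padding, seven stride-2 convolutions reduce a 64 × 64 input to 1 × 1, so the final 7 × 7 convolution would also need padding='same' (or a larger input) to avoid a similar negative-dimension error; printing model.summary() after each change is the quickest way to verify.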
