
CNN accuracy: 0.0000e+00 for multi-class classification on images

The following code gives me an accuracy of 0.0000e+00. Has anyone else run into this on a multi-class classification task (49 different image classes)?

I am using ResNet50 on top of my CNN model, with softmax as the final activation function, categorical_crossentropy as the loss, and Adam as the optimizer.

What might I be doing wrong?

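import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, Conv2D, Dense, Dropout,
                                     Flatten, MaxPooling2D)

## Build CNN architecture
model1 = Sequential()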
model1.add(Conv2D(32, (3,3), strides=1, input_shape = (720, 720, 3)))
model1.add(Activation('relu'))
model1.add(Conv2D(32, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))

model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))

model1.add(Flatten())
model1.add(Dense(200))
model1.add(Activation('relu'))
model1.add(Dense(200))
model1.add(Dropout(0.24))
model1.add(Activation('relu'))
model1.add(Dense(49, activation='softmax')) 

model1.summary()

# Image data generator for on the fly image augmentation
directory = '/home/carlini-TF2/data/train/'
batch_size = 64
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        rotation_range=90.,
        shear_range=0.2,
        zoom_range=[0.8,1.2],
        horizontal_flip=True,
        validation_split=0.2,
        preprocessing_function=tf.keras.applications.resnet50.preprocess_input)
train_generator = train_datagen.flow_from_directory(directory=directory,
                                                    subset='training',
                                                    target_size=(720, 720),
                                                    shuffle=True,
                                                    seed=42,
                                                    color_mode='rgb', 
                                                    class_mode='categorical', 
                                                    batch_size=batch_size)
valid_directory = '/home/carlini-TF2/data/test/'
valid_generator = train_datagen.flow_from_directory(directory=valid_directory,
                                                    target_size=(720, 720),
                                                    color_mode="rgb",
                                                    batch_size=batch_size,
                                                    class_mode="categorical",
                                                    subset='validation',
                                                    shuffle=True,
                                                    seed=42)

## Compile and train Neural Network 
METRICS = [
        tf.keras.metrics.Accuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall')]

# optimal optimizer FN | loss FN to work with accuracy metric
model1.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
               optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
               metrics=METRICS)

# stop training when loss gets worse after consecutive epochs
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3) 

# fit model with augmented training set and validation set | shuffle batch 
history = model1.fit(train_generator,
                    validation_data = valid_generator,
                    steps_per_epoch = train_generator.n//batch_size,
                    validation_steps = valid_generator.n//batch_size,
                    shuffle=True, callbacks = [callback],
                    epochs=50)

The issue is that ResNet50 was only being used in the data pipeline (as the generator's preprocessing_function) and not in the CNN architecture itself. To get a reasonably robust model, the following code is needed.
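To make the mismatch concrete, here is a minimal NumPy sketch of what tf.keras.applications.resnet50.preprocess_input does to a batch, as far as I understand it: it reorders RGB to BGR and subtracts the ImageNet channel means, without rescaling to [0, 1]. The function name caffe_style_preprocess and the white test image are my own, for illustration only; this is an approximation of the library behaviour, not its actual implementation.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by the "caffe"-style
# preprocessing that the ResNet50 preprocess_input applies.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(rgb_batch):
    """Approximate ResNet50 preprocessing (hypothetical helper).

    rgb_batch: float array of shape (N, H, W, 3) with values in [0, 255].
    """
    bgr = rgb_batch[..., ::-1].astype(np.float32)  # reverse channel order: RGB -> BGR
    return bgr - IMAGENET_BGR_MEAN                 # zero-center each channel, no rescaling

# A 1x1 pure-white "image", just to show the resulting value range.
white = np.full((1, 1, 1, 3), 255.0, dtype=np.float32)
out = caffe_style_preprocess(white)
print(out.min(), out.max())
```

Inputs prepared this way are meant for a network carrying the pretrained ResNet50 weights; a small CNN trained from scratch gets nothing from them.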

We can throw out the previous architecture and instead put a very simple classification head on top of ResNet50, since this gives conclusive results.

We use the Functional API here, since ResNet50 itself is built with it:

import numpy as np
import tensorflow as tf

# Constant initial bias for the new output layer
data_bias = np.log(1802./4657)
initializer = tf.keras.initializers.Constant(data_bias)

# Pretrained ResNet50 without its ImageNet classification head; weights are frozen
resnet50_imagenet_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False, input_shape=(720,720,3))
resnet50_imagenet_model.trainable = False

# Flatten the output of ResNet50
flattened = tf.keras.layers.Flatten()(resnet50_imagenet_model.output)

# Fully connected output layer with 49 different labels
fc2 = tf.keras.layers.Dense(49, activation='softmax', bias_initializer=initializer, name="AddedDense2")(flattened)

model1 = tf.keras.models.Model(inputs=resnet50_imagenet_model.input, outputs=fc2)
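
The compile step from the question is reused with this model. One thing worth double-checking there, which the code above does not change: as far as I know, tf.keras.metrics.Accuracy counts how often predictions are exactly equal to the labels, so comparing softmax probabilities against one-hot vectors reports a value near zero no matter how well the model does. tf.keras.metrics.CategoricalAccuracy compares the argmax of each row instead. Below is a NumPy sketch of the two calculations; the labels and predictions are made up for illustration, and the helper names are mine, not Keras functions.

```python
import numpy as np

# Made-up one-hot labels and softmax-like outputs: 3 samples, 4 classes.
y_true = np.array([[0, 0, 1, 0],
                   [1, 0, 0, 0],
                   [0, 1, 0, 0]], dtype=np.float32)
y_pred = np.array([[0.05, 0.05, 0.85, 0.05],   # argmax matches the label
                   [0.70, 0.10, 0.10, 0.10],   # argmax matches the label
                   [0.40, 0.30, 0.20, 0.10]],  # argmax does not match
                  dtype=np.float32)

def exact_match_accuracy(y_true, y_pred):
    # Fraction of entries where the raw predicted value equals the label value.
    return float(np.mean(y_true == y_pred))

def categorical_accuracy(y_true, y_pred):
    # Fraction of rows where the largest prediction sits at the labelled class.
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_pred, axis=1)))

exact = exact_match_accuracy(y_true, y_pred)
categorical = categorical_accuracy(y_true, y_pred)
print(exact, categorical)
```

If this is what is happening, replacing tf.keras.metrics.Accuracy(name='accuracy') with tf.keras.metrics.CategoricalAccuracy(name='accuracy') in METRICS should make the reported accuracy meaningful.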
