简体   繁体   中英

Why is a CNN model struggling to classify a colored MNIST?

I'm trying to classify colored MNIST digits with a basic CNN architecture on Keras. Here is the piece of code that colors the original dataset into purely either red, green or blue.

def load_norm_data():
  ## load basic mnist
  (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
  train_images = np.zeros((*x_train.shape, 3)) # orig shape: (60 000, 28, 28, 1) -> rgb shape: (60 000, 28, 28, 3)
  for num in range(x_train.shape[0]):
    rgb = np.random.randint(3) 
    train_images[num, ..., rgb] = x_train[num]/255
  return train_images, y_train
  
if __name__ == '__main__':
  ims, labels = load_norm_data()
  for num in range(10):
    plt.subplot(2, 5, num+1)
    plt.imshow(ims[num])
    plt.axis('off')

which gives for the first couple of digits:

在此处输入图像描述

Then, I attempt to classify this colored dataset into the same 10 digit classes of MNIST, so the labels aren't changing --and yet the models accuracy drops from 95% for non-colored MNIST, to wildly variable 30-70% on colored MNIST, vastly depending on weight initialization... Please find below the architecture of said model:

model = keras.Sequential()
model.add(keras.layers.Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Conv2D(64, kernel_size=(3,3), padding='same'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2), padding='same'))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(10, activation='relu'))
model.add(keras.layers.Softmax())

input_shape = train_images.shape
model.build(input_shape)

model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

model.fit(train_images, train_numbers, batch_size=12, epochs=25)

Initially, I thought that this drop in performance might be linked to data irregularity (eg imagine a lot of 3s in the data ended up being green, thus the model learns green = 3). So I checked the data, the counts are good and the rgb distribution for each class is near 33% for each color too. I also checked the misclassified images to see if there were many representatives of a certain color or digit, but it doesn't seem to be the case either. In any case, after reading Keras' documentation and because of the fact that Conv2D forces you to pass it a 2-dimensional kernel_size that I imagine thus operates on all channels of the input image, the model shouldn't be taking color into account for classification here.

Am I missing something here?

Please let me know if you need any further information. Thank you in advance.

The last part of the model includes a dense -> relu -> softmax. The relu activation should be removed. In addition, you might benefit from adding non-linearities (eg, relu) in your convolutional blocks. Otherwise, the neural network will end up being a (big) linear function and will not work as well for non-linear data.

model = keras.Sequential()
model.add(keras.layers.Conv2D(64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Conv2D(64, kernel_size=(3,3), padding='same', activation='relu'))
model.add(keras.layers.MaxPool2D(pool_size=(2,2), padding='same'))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(10))
model.add(keras.layers.Softmax())

It is interesting that the original model worked well on the MNIST dataset. I cannot say for sure why, but perhaps the MNIST dataset is simple enough that the model was able to cope. Also, the relu -> softmax would clamp negative values to 0, and maybe there were not many negative values.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM