
ResNet50 (without pretrained weights) on the CIFAR-10 dataset - improving the accuracy

I need to train a ResNet50 model on the CIFAR-10 dataset, without using the pretrained ImageNet weights.

# Imports assumed by the snippet below (standalone Keras style, matching the lr/decay arguments used later)
from keras import layers
from keras.models import Sequential
from keras.layers import BatchNormalization, UpSampling2D
from keras.regularizers import l2
from keras.optimizers import SGD
from keras.callbacks import ReduceLROnPlateau
from keras.applications.resnet50 import ResNet50

conv_base = ResNet50(input_shape=(32,32,3), weights=None, pooling = 'avg', include_top=False)

for layer in conv_base.layers:
      layer.trainable = False

model1 = Sequential()
#model1.add(UpSampling2D(input_shape = (32,32,3))) #Upsampling is simply a way to magnify our image to make it bigger. 
#model1.add(BatchNormalization())
#model1.add(UpSampling2D())
#model1.add(BatchNormalization())
#model1.add(UpSampling2D())
#model1.add(BatchNormalization())
model1.add(conv_base)
model1.add(layers.Flatten())
#model1.add(BatchNormalization())
#model1.add(layers.Dense(1024,activation=('relu'), kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
#model1.add(layers.Dropout(0.5))
#model1.add(BatchNormalization())
#model1.add(layers.Dense(512, activation = 'relu', kernel_regularizer=l2(0.01), bias_regularizer=l2(0.01)))
#model1.add(layers.Dropout(0.4))
#model1.add(BatchNormalization())
#model1.add(layers.Dense(256, activation = 'relu',kernel_regularizer=l2(0.1)))
#model1.add(layers.Dropout(0.4))
#model1.add(BatchNormalization())
#model1.add(layers.Dense(128, activation = 'relu', kernel_regularizer=l2(0.1)))
#model1.add(layers.Dropout(0.4))
#model1.add(BatchNormalization())
model1.add(layers.Dense(10, activation = 'softmax'))


model1.summary()

opt = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)

model1.compile(
  loss='sparse_categorical_crossentropy',
  optimizer=opt,
  metrics=['accuracy']
)


lrr= ReduceLROnPlateau(
                       monitor='val_acc', 
                       factor=.01, 
                       patience=3, 
                       min_lr=1e-5)
model1.fit(X_train,y_train, batch_size = 100, validation_data = (X_val, y_val), epochs = 100, callbacks=[lrr])

The accuracy does not improve over the epochs and the loss remains unchanged.

Can someone help me improve the accuracy? Why doesn't adding more dense layers / weight regularization / batch normalization layers improve the accuracy?

Note: I have tried data augmentation; it further reduced the accuracy.


First things first:

for layer in conv_base.layers:
      layer.trainable = False

This means you're not training anything at all; the model just uses its randomly initialized weights.
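Since weights=None means training from scratch, the backbone has to stay trainable. A minimal fix, reusing the conv_base defined in the question:

for layer in conv_base.layers:
    layer.trainable = True  # train every layer from scratch; or simply delete the freezing loop

Freezing layers only makes sense when you start from pretrained weights.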

But there's also another big problem: the input size is wrong for this architecture.

ImageNet-based models normally downsample the input 5 times (from 224x224 to 7x7 for ResNet). You can't really use 32x32 images with them, because the feature maps shrink to 1x1 in the last few convolution layers, which isn't a good thing.
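You can confirm this directly by building the stock backbone on a 32x32 input and checking the spatial size of its output (a quick check using the same ResNet50 constructor as in the question, just without the average pooling):

from keras.applications.resnet50 import ResNet50

check = ResNet50(input_shape=(32, 32, 3), weights=None, include_top=False)
print(check.output_shape)  # (None, 1, 1, 2048): the 32x32 input has collapsed to 1x1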

Solution

Modern models like ResNet or DenseNet use strides=2 in their first convolution layer. The easiest fix is to remove that stride, which leaves you with a 2x2 feature map in the last block; that is sufficient most of the time. Alternatively, you can resize the input to 64x64.

For example, change the conv1 layer in the Keras ResNet50 source (resnet50.py) from

x = layers.Conv2D(64, (7, 7), strides=(2, 2), padding='valid',
                  kernel_initializer='he_normal', name='conv1')(x)

to

x = layers.Conv2D(64, (7, 7), strides=1, padding='valid',
                  kernel_initializer='he_normal', name='conv1')(x)
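Alternatively, if you would rather not edit the ResNet source, you can enlarge the 32x32 images before they reach the backbone. A rough sketch along those lines (the 64x64 size and the UpSampling2D factor are just one reasonable choice, not part of the original answer):

from keras.models import Sequential
from keras.layers import UpSampling2D, Dense
from keras.applications.resnet50 import ResNet50

model = Sequential()
# 32x32 -> 64x64 by repeating pixels; an interpolation-based resize would also work
model.add(UpSampling2D(size=(2, 2), input_shape=(32, 32, 3)))
model.add(ResNet50(input_shape=(64, 64, 3), weights=None, include_top=False, pooling='avg'))
model.add(Dense(10, activation='softmax'))

With this, the backbone still downsamples 5 times, but a 64x64 input at least leaves a 2x2 map before the global average pooling.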

But if you don't need the ImageNet weights, then the best approach is to build a new network with fewer downsampling stages, for example:

    x = layers.ZeroPadding2D(padding=(3, 3), name='conv1_pad')(img_input)
    x = layers.Conv2D(64, (7, 7),
                      strides=1,
                      padding='valid',
                      kernel_initializer='he_normal',
                      name='conv1')(x)
    x = layers.BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
    x = layers.Activation('relu')(x)
    x = layers.ZeroPadding2D(padding=(1, 1), name='pool1_pad')(x)
    x = layers.MaxPooling2D((3, 3), strides=(2, 2))(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a', strides=(1, 1))
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

Here I simply changed the stride from 2 to 1 on two layers (conv1 and the first conv_block of stage 5), which leaves a 4x4 feature map in the last block; that is more than enough.

The official CIFAR implementations, like ResNet56 and ResNet110, are structured in the same way.
