
High val_loss and low val_accuracy when training ResNet50 model

I've been training a ResNet50 model with some added layers of my own, but each epoch brings a higher val_loss while val_accuracy stays the same. I think the model is overfitting, but I'm not sure how to fix that. I'm using the FER2013 dataset (as .jpg images) to train and test the model.

Code:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

base_model = tf.keras.applications.ResNet50(input_shape=(48,48,3), include_top=False, weights='imagenet')
#ResNet model with additional convolutional layers.
model = Sequential()
model.add(base_model)
model.add(Conv2D(32, kernel_size=(3,3), activation='relu', padding='same', input_shape=(48,48,3), data_format='channels_last'))
model.add(Conv2D(64,kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2),padding='same'))
model.add(Dropout(0.25))
model.add(Conv2D(128,kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), padding='same'))
model.add(Conv2D(128,kernel_size=(3,3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2), padding='same'))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(7, activation='softmax'))

adam = tf.keras.optimizers.Adam(learning_rate=0.0001)
model.compile(loss='categorical_crossentropy', optimizer=adam, metrics=['accuracy'])
model_info = model.fit(traindata,epochs=100,validation_data=testdata)


#base_model.compile(loss='categorical_crossentrophy', optimizer='adam', metrics=['accuracy'])
#model_info = base_model.fit(traindata, steps_per_epoch=449,epochs=100,validation_data=testdata,validation_steps=112)

I'm using a batch_size of 128; any help would be great.

Results from the first epochs:

Epoch 1/100 98/98 [==============================] - 276s 3s/step - loss: 1.6894 - accuracy: 0.3039 - val_loss: 3.2897 - val_accuracy: 0.1737

Epoch 2/100 98/98 [==============================] - 342s 4s/step - loss: 1.4305 - accuracy: 0.3630 - val_loss: 13.5700 - val_accuracy: 0.1737

As the model summary below shows, the output shape of the ResNet50 base is only (None, 2, 2, 2048), so it does not make sense to apply further Conv2D and MaxPooling2D layers to it:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 resnet50 (Functional)       (None, 2, 2, 2048)        23587712  
                                                                 
 conv2d (Conv2D)             (None, 2, 2, 32)          589856    
                                                                 
 conv2d_1 (Conv2D)           (None, 2, 2, 64)          18496     
                                                                 
 max_pooling2d (MaxPooling2D  (None, 1, 1, 64)         0         
 )                                                               
                                                                 
 dropout (Dropout)           (None, 1, 1, 64)          0         
                                                                 
 conv2d_2 (Conv2D)           (None, 1, 1, 128)         73856     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 1, 1, 128)        0         
 2D)                                                             
                                                                 
 conv2d_3 (Conv2D)           (None, 1, 1, 128)         147584    
                                                                 
 max_pooling2d_2 (MaxPooling  (None, 1, 1, 128)        0         
 2D)                                                             
                                                                 
 dropout_1 (Dropout)         (None, 1, 1, 128)         0         
                                                                 
 flatten (Flatten)           (None, 128)               0         
                                                                 
 dense (Dense)               (None, 1024)              132096    
                                                                 
 dropout_2 (Dropout)         (None, 1024)              0         
                                                                 
 dense_1 (Dense)             (None, 7)                 7175      
                                                                 
=================================================================
Total params: 24,556,775
Trainable params: 24,503,655
Non-trainable params: 53,120
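
If the goal is to reuse the pretrained ResNet50 features, a common alternative is to pool that (2, 2, 2048) output and feed it into a small dense classifier instead of stacking more convolutions on a 2x2 feature map. Below is a minimal sketch of that idea, assuming the same traindata/testdata generators from the question; the layer sizes and the choice to freeze the backbone are illustrative assumptions, not part of the original post:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout

# Pretrained ResNet50 backbone; 48x48x3 matches the FER2013 input used above.
base_model = tf.keras.applications.ResNet50(
    input_shape=(48, 48, 3), include_top=False, weights='imagenet')
base_model.trainable = False  # assumption: freeze the backbone at first to limit overfitting

model = Sequential([
    base_model,
    GlobalAveragePooling2D(),        # (2, 2, 2048) -> (2048,): no further convs on a 2x2 map
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(7, activation='softmax'),  # 7 FER2013 emotion classes
])

model.compile(loss='categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              metrics=['accuracy'])

# model.fit(traindata, epochs=100, validation_data=testdata)

Once this dense head has converged, the last ResNet blocks can be unfrozen and fine-tuned with a low learning rate if more accuracy is needed.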
