
Output labels sticking to the same value in VGG16

I am trying to run a VGG16 model for digit classification on the SVHN dataset:

http://ufldl.stanford.edu/housenumbers/train_32x32.mat

However, the predicted values are always the same. I tried feeding the images preprocessed in three ways (sketched below):

  1. scaling from 0-255 to 0-1

  2. subtracting the mean from each image

  3. dividing by the standard deviation
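
For reference, a minimal NumPy sketch of those three variants (the function names are illustrative; img is assumed to be a single image as a float array):

import numpy as np

def scale_01(img):
    # 1. scale pixel values from 0-255 to 0-1
    return img / 255.0

def center(img):
    # 2. subtract the per-image mean
    return img - np.mean(img)

def standardize(img):
    # 3. subtract the mean, then divide by the standard deviation
    return (img - np.mean(img)) / np.std(img)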

Here is how I am running it:

Initialising VGG16:

vgg = tf.keras.applications.vgg16.VGG16(
    include_top=True,
    weights=None,
    input_tensor=None,
    input_shape=(64, 64, 3),
    pooling='max',
    classes=10)

vgg.compile(loss=tf.keras.losses.categorical_crossentropy,
            optimizer=tf.keras.optimizers.SGD(lr=1e-4, momentum=0.9, nesterov=True),
            metrics=['mae', 'acc'])
# Here train_data has shape (None, 64, 64, 3) and labels_data has shape (None, 10), one-hot encoded
vgg.fit(train_data, labels_data, epochs=5, batch_size=96)

The training data can be read like this:

train_data = sio.loadmat('datasets/housing/train_32x32.mat')
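
For reference, the .mat file stores the images channel-last with the sample index in the last axis; a quick shape check (sizes shown are for the full training split):

data = sio.loadmat('datasets/housing/train_32x32.mat')
print(data['X'].shape)   # (32, 32, 3, 73257): height x width x channels x samples
print(data['y'].shape)   # (73257, 1): labels run 1..10, where 10 stands for digit 0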

Below are the two functions I am using to preprocess train_data:

import numpy as np
import cv2
import tensorflow as tf
import os
import scipy
from skimage import data, io, filters
import scipy.io as sio
from utils import *
import h5py

from tensorflow.python.keras._impl.keras import backend as K
from tensorflow.python.keras._impl.keras.applications.imagenet_utils import _obtain_input_shape
from tensorflow.python.keras._impl.keras.applications.imagenet_utils import decode_predictions  # pylint: disable=unused-import
from tensorflow.python.keras._impl.keras.applications.imagenet_utils import preprocess_input  # pylint: disable=unused-import
from tensorflow.python.keras._impl.keras.engine.topology import get_source_inputs
from tensorflow.python.keras._impl.keras.layers import Conv2D
from tensorflow.python.keras._impl.keras.layers import Dense
from tensorflow.python.keras._impl.keras.layers import Flatten
from tensorflow.python.keras._impl.keras.layers import GlobalAveragePooling2D
from tensorflow.python.keras._impl.keras.layers import GlobalMaxPooling2D
from tensorflow.python.keras._impl.keras.layers import Input
from tensorflow.python.keras._impl.keras.layers import MaxPooling2D
from tensorflow.python.keras._impl.keras.models import Model
from tensorflow.python.keras._impl.keras.utils import layer_utils
from tensorflow.python.keras._impl.keras.utils.data_utils import get_file
from tensorflow.python.keras._impl.keras.engine import Layer

def reshape_mat_vgg(QUANTITY, matfilepath='datasets/housing/train_32x32.mat', type="non-quantized", size=(64, 64)):
    data = read_mat_file(matfilepath)  # read_mat_file comes from utils
    train = data['X'][:, :, :, 0:QUANTITY]
    # move the sample axis to the front: (32, 32, 3, N) -> (N, 32, 32, 3)
    train = np.swapaxes(np.swapaxes(np.swapaxes(train, 2, 3), 1, 2), 0, 1)
    labels = data['y'][0:QUANTITY]
    labels_data = np.array([])
    train_data = np.zeros((QUANTITY, size[0], size[1], 3))
    print("Reorganizing Data...")
    for i in range(QUANTITY):
        image_i = np.copy(train[i, :, :, :])
        image_i = preprocess_small_vgg16(image_i, new_size=size, type=type)
        train_data[i, :, :, :] = image_i
        # one-hot encode; SVHN labels run 1..10 (10 stands for digit 0), so shift to 0..9
        label_i = np.zeros((10))
        label_i[labels[i] - 1] = 1.0
        label_i = label_i.reshape(1, 10)
        if i == 0:
            labels_data = np.vstack((label_i))
        else:
            labels_data = np.vstack((labels_data, label_i))
        if i % 1000 == 0:
            print(i * 100 / (QUANTITY - 1), "percent done...")
    print("100 percent done...")
    return train_data, labels_data



def preprocess_small_vgg16(image, new_size=(64, 64), type="non-quantized"):
    img = np.copy(image)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # stretch the intensity range so the maximum pixel value is 255
    imgnorm = img * 255.0 / np.max(img)
    # crop 5 pixels from the left and right edges to center the digit
    borderless = imgnorm[:, 5:imgnorm.shape[1] - 5, :]
    final_img = cv2.resize(borderless, new_size)
    final_img = final_img / np.max(final_img)  # scale to 0-1
    # standardize to zero mean and unit variance
    stddev = np.std(final_img)
    mn = np.mean(final_img)
    final_img = (final_img - mn) / stddev
    return final_img
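
Putting it together, I call these roughly as follows (5000 samples, matching the output below):

train_data, labels_data = reshape_mat_vgg(5000)
vgg.fit(train_data, labels_data, epochs=5, batch_size=96)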

Output:

Epoch 1/10
5000/5000 [==============================] - 1346s - loss: 3.2877 - mean_absolute_error: 0.0029 - acc: 0.1850
Epoch 2/10

Running for multiple epochs does not help; I tried 5 epochs.

When I check the predictions, the result is the same for every input, e.g. (converted using np.argmax(pred, axis=-1)):

[3 3 3 . . . 3 3 3]
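
The check itself looks roughly like this (the slice of 100 samples is arbitrary):

pred = vgg.predict(train_data[:100])
print(np.argmax(pred, axis=-1))                # predicted classes -- all identical
print(np.argmax(labels_data[:100], axis=-1))   # true classes, for comparison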

Please point out the problem in my model.

I think most of your problem is that 5 epochs is far from enough for the VGG weights to converge. In the original VGG paper, the authors state that they trained for 370K iterations (74 epochs: 370,000 iterations at batch size 256 over the ~1.28M ImageNet training images) to get the best results, so you may consider giving it more time to converge.

I fed your dataset to one of the TensorFlow tutorial nets, and you can see how the total loss decreases over thousands of iterations:

[plot: total_loss curve, decreasing over training iterations]

Of course, you can also tune the learning-rate decay to avoid time wasted on plateaus.
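
For instance, a minimal sketch using Keras's built-in ReduceLROnPlateau callback (the factor, patience, and epoch budget here are illustrative, not tuned for SVHN):

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='loss', factor=0.5,
                                                 patience=3, min_lr=1e-6)
# 74 epochs mirrors the budget quoted from the VGG paper; adjust as needed
vgg.fit(train_data, labels_data, epochs=74, batch_size=96, callbacks=[reduce_lr])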
