Keras Val_acc is good but prediction for same data is poor

Question

I am using Keras for a two-class CNN classification. During training, my val_acc is above 95 percent. But when I predict on the same validation data, the accuracy is less than 60 percent. Is that even possible? This is my code:

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras import backend as K
from keras.callbacks import TensorBoard
from keras.preprocessing import image
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(1337) # for reproducibility
%matplotlib inline

img_width, img_height = 230,170

train_data_dir = 'data/Train'
validation_data_dir = 'data/Validation'
nb_train_samples =  13044
nb_validation_samples = 200
epochs =14
batch_size = 32

if K.image_data_format() == 'channels_first':
    input_shape = (1, img_width, img_height)
else:
    input_shape = (img_width, img_height, 1)

model = Sequential()

model.add(Convolution2D(32, (3, 3),data_format='channels_first' , input_shape=(1,230,170))) 
convout1 = Activation('relu')
model.add(convout1)
convout2 = MaxPooling2D(pool_size=(2,2 ), strides= None , padding='valid', data_format='channels_first')
model.add(convout2)

model.add(Convolution2D(32, (3, 3),data_format='channels_first'))
convout3 = Activation('relu')
model.add(convout3)
model.add(MaxPooling2D(pool_size=(2, 2), data_format='channels_first'))

model.add(Convolution2D(64, (3, 3),data_format='channels_first'))
convout4 = Activation('relu')
model.add(convout4)
convout5 = MaxPooling2D(pool_size=(2, 2), data_format='channels_first')
model.add(convout5)

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

train_datagen = ImageDataGenerator(rescale=1. / 255, 
                                   shear_range=0, 
                                   zoom_range=0.2, 
                                   horizontal_flip=False, 
                                   data_format='channels_first')

test_datagen = ImageDataGenerator(rescale=1. / 255, 
                                  data_format='channels_first')
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',
    color_mode= "grayscale",
    shuffle=True
)
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_width, img_height),
    batch_size=batch_size,
    class_mode='binary',
    color_mode= "grayscale",
    shuffle=True
)
model.fit_generator(
    train_generator,
    steps_per_epoch=nb_train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=nb_validation_samples // batch_size,
    shuffle=True
    )

Epoch 37/37

407/407[==============] - 1775s 4s/step - loss: 0.12 - acc: 0.96 - val_loss: 0.02 - val_acc: 0.99

#Prediction:
test_data_dir='data/test'
validgen = ImageDataGenerator(horizontal_flip=False, data_format='channels_first')
test_gen = validgen.flow_from_directory(
         test_data_dir,
         target_size=(img_width, img_height),
         batch_size=1,
         class_mode='binary',
         shuffle=False,
         color_mode= "grayscale")

preds = model.predict_generator(test_gen)

In the output below, about 7 images belong to class 0. I tried the same with all 100 class 0 images from the validation data: only 15 were predicted as class 0 and the rest were predicted as class 1.

Found 10 images belonging to 1 classes.
[[ 1.]
 [ 1.]
 [ 1.]
 [ 1.]
 [ 1.]
 [ 1.]
 [ 1.]
 [ 0.]
 [ 0.]
 [ 1.]]

Answer 1

You are not scaling your test images by 1./255 as you do for your training and validation images. The statistics of your test data should match those of the training data, so the generator used for prediction needs the same rescale=1. / 255 argument as the other two.
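
To see why a missing rescale can flip predictions, here is a minimal NumPy sketch. It is not the asker's CNN: the one-layer logistic "model", its weights and bias, and the pixel ranges are all made up for illustration. The weights are chosen so that dark images map to class 0 and bright images to class 1 when pixels lie in [0, 1].

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical stand-in for a trained model: a single logistic unit whose
# weights assume inputs rescaled to [0, 1] (mean pixel > 0.5 means class 1).
n_pixels = 16
weights = np.ones(n_pixels)
bias = -0.5 * n_pixels

def predict(batch):
    """Return hard 0/1 predictions for a batch of flattened images."""
    return (sigmoid(batch @ weights + bias) > 0.5).astype(int)

# Made-up "dark" class 0 images with raw pixel values in 10..99.
rng = np.random.default_rng(0)
raw_dark = rng.integers(10, 100, size=(5, n_pixels)).astype(np.float64)

scaled_preds = predict(raw_dark / 255.0)  # same preprocessing as training
unscaled_preds = predict(raw_dark)        # rescale step forgotten

print("with rescale:   ", scaled_preds)
print("without rescale:", unscaled_preds)
```

With rescaling, every dark image lands in class 0. Without it, the inputs are up to 255 times larger than the weights expect, the sigmoid saturates, and every image comes out as class 1, which is the same symptom as in the question.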

Answer 2

I am reposting the essential part of an answer I originally posted on Quora, as advised. I had a similar problem, and I hope this helps someone else as well. After researching on the Internet, I came across this answer by cjbayron.

What solved the issue for me was having the following in my training code:

import keras
import os
from keras import backend as K
import tensorflow as tf
import random as rn
import numpy as np

os.environ['PYTHONHASHSEED'] = '0'
np.random.seed(70)
rn.seed(70)
tf.set_random_seed(70)

# ... code for my model ...

# Very important: save the session after model.fit has completed

model.fit_generator(train_batches, steps_per_epoch=4900, validation_data=valid_batches,validation_steps=1225, epochs=40, verbose=2, callbacks=callbacks_list)

saver = tf.train.Saver()
sess = keras.backend.get_session()
saver.save(sess, 'gdrive/My Drive/KerasCNN/model/keras_session/session.ckpt')

Saving the session also generates the following files:

/keras_session/checkpoint
/keras_session/session.ckpt.data-00000-of-00001
/keras_session/session.ckpt.index
/keras_session/session.ckpt.meta

I downloaded all of these files from my Google Drive and placed them in a local directory. You might notice that there is no file named just session.ckpt, even though that name is passed to saver.restore() when loading. This is fine: TensorFlow treats the path as a checkpoint prefix and finds the matching files, so no error is raised.

Loading the model with load_model()

In PyCharm, I loaded the model as follows:

from keras.models import load_model

model = load_model('C:\\Users\\Username\\PycharmProjects\\MyProject\\mymodel\\mymodel.h5')

saver = tf.train.Saver()
sess = keras.backend.get_session()
saver.restore(sess,'C:\\Users\\Username\\PycharmProjects\\MyProject\\mymodel\\keras_session\\session.ckpt')

# ... then predict the images as you wish ...
# load_image, test_path and file come from my own code and are not shown here
pred = model.predict_classes(load_image(os.path.join(test_path, file)))

It is important to place the restore code as shown, i.e. after loading the model. Once I did this, I tried predicting the same images I had used for training and validation, and this time the model mispredicted only around 2 images per class. Now I was sure that my model was okay, so I went ahead and predicted my test images, i.e. images it had not seen before, and it performed very well.
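
A sanity check like the one described above can be scripted instead of counted by hand. The sketch below is one possible way to do it with NumPy only. The helper name per_class_errors and the sample probabilities and labels are made up for illustration; in practice the probabilities would come from the model's sigmoid output and the labels would have to be in the same order as the predictions (so no shuffling in the prediction generator).

```python
import numpy as np

def per_class_errors(probabilities, labels, threshold=0.5):
    """Count misclassified samples per class for a binary sigmoid output."""
    predicted = (np.asarray(probabilities).ravel() > threshold).astype(int)
    labels = np.asarray(labels).ravel().astype(int)
    return {
        cls: int(np.sum((labels == cls) & (predicted != cls)))
        for cls in (0, 1)
    }

# Made-up sigmoid outputs with shape (n, 1), plus matching true labels.
probs = np.array([[0.1], [0.7], [0.2], [0.9], [0.4], [0.8]])
true_labels = np.array([0, 0, 0, 1, 1, 1])

errors = per_class_errors(probs, true_labels)
accuracy = 1.0 - sum(errors.values()) / len(true_labels)

print("errors per class:", errors)
print("accuracy:", round(accuracy, 3))
```

If the accuracy computed this way on the validation images is far below the val_acc reported during training, the prediction pipeline (preprocessing or restored weights) differs from the training one.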

Answer 1: 3, ACCEPTED, 2018-02-19 04:39:14
Answer 2: 0, 2019-04-30 18:21:08
