Low accuracy after training a CNN

Question

I am trying to train a CNN that classifies handwritten digits using Keras, but the training accuracy is very low (below 10%) and the loss is large. I also tried a simple neural network without convolutions, and it didn't work either.

This is my code.

import tensorflow as tf
from tensorflow import keras
import numpy as np
import matplotlib.pyplot as plt

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

#Explore data
print(y_train[12])
print(np.shape(x_train))
print(np.shape(x_test))
# we have 60000 images for training and 10000 for testing

# Scaling data
x_train = x_train/255
y_train = y_train/255
#reshape the data
x_train = x_train.reshape(60000,28,28,1)
x_test = x_test.reshape(10000,28,28,1)
y_train = y_train.reshape(60000,1)
y_test = y_test.reshape(10000,1)

#Create a model
model = keras.Sequential([
keras.layers.Conv2D(64,(3,3),(1,1),padding = "same",input_shape=(28,28,1)),
keras.layers.MaxPooling2D(pool_size = (2,2),padding = "valid"),
keras.layers.Conv2D(32,(3,3),(1,1),padding = "same"),
keras.layers.MaxPooling2D(pool_size = (2,2),padding = "valid"),
keras.layers.Flatten(),
keras.layers.Dense(128,activation = "relu"),
keras.layers.Dense(10,activation = "softmax")])

model.compile(optimizer = "adam",
loss = "sparse_categorical_crossentropy",
metrics  = ['accuracy'])

model.fit(x_train,y_train,epochs=10)
test_loss,test_acc = model.evaluate(x_test,y_test)
print("\ntest accuracy:",test_acc)

Could anyone advise me on how to improve my model?

Answer 1

Your problem is here:

x_train = x_train/255
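y_train = y_train/255 # wrong: this scales the labels, not the test images

You should have rescaled x_test, not y_train. Dividing the labels by 255 turns the class indices 0 to 9 into fractions below 0.04, so they no longer identify the digit classes that sparse_categorical_crossentropy expects.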

x_train = x_train/255
x_test = x_test/255

That was probably just a typo on your part. Change these lines and you'll get 95%+ accuracy.
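
To see what the stray division does to the labels, here is a minimal NumPy sketch. The labels array is a hypothetical stand-in for y_train, and the cast to int64 is my assumption about how the sparse loss reads its targets, so treat it as an illustration rather than a trace of Keras internals.

```python
import numpy as np

def scale_like_pixels(labels):
    """Reproduce the bug: divide class labels by 255 as if they were pixels."""
    return labels / 255

def to_class_index(values):
    """Assumed behaviour of the sparse loss: truncate targets to int64 indices."""
    return values.astype(np.int64)

# Hypothetical stand-in for y_train: one example of each digit 0-9.
labels = np.arange(10, dtype=np.uint8)

scaled = scale_like_pixels(labels)
collapsed = to_class_index(scaled)

print(scaled.max())          # the largest "label" is now 9/255
print(np.unique(collapsed))  # distinct class indices left after truncation
```

Under that assumption every digit ends up with the same target, which would explain why training cannot do better than chance.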

Answer 2

Your model has a scaling problem. I would also suggest using TensorFlow 2.0. Scale the images, both train and test:

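x_train = x_train.astype('float32') / 255  # cast first: in-place /= fails on uint8 arrays
x_test = x_test.astype('float32') / 255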

You should not scale the labels, as you did here:

x_train = x_train/255
y_train = y_train/255

Afterwards, we can transform the labels into a one-hot encoding:

from tensorflow.keras.utils import to_categorical

y_train = to_categorical(y_train, 10)

y_test = to_categorical(y_test, 10)

One-hot labels pair with this loss (integer labels would instead go with sparse_categorical_crossentropy):

loss='categorical_crossentropy',
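
To make the relation between the two losses concrete, here is a rough NumPy sketch. The helper names one_hot, categorical_ce and sparse_ce are my own stand-ins, not Keras functions, and the sketch skips the epsilon clipping that a real implementation would apply. It shows that one-hot labels with the categorical loss and integer labels with the sparse loss give the same value.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Stand-in for to_categorical: row i has a 1 in column labels[i]."""
    return np.eye(num_classes, dtype=np.float32)[labels]

def categorical_ce(y_onehot, probs):
    """Mean cross-entropy for one-hot targets (no clipping, for illustration)."""
    return float(np.mean(-np.sum(y_onehot * np.log(probs), axis=1)))

def sparse_ce(labels, probs):
    """Mean cross-entropy for integer targets: pick the true-class probability."""
    picked = probs[np.arange(len(labels)), labels]
    return float(np.mean(-np.log(picked)))

# Hypothetical batch: three integer labels and random softmax outputs.
labels = np.array([3, 0, 9])
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

dense_loss = categorical_ce(one_hot(labels, 10), probs)
sparse_loss = sparse_ce(labels, probs)
print(dense_loss, sparse_loss)
```

So the one-hot step is a matter of which loss you choose, not something that by itself changes the accuracy.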

The Sequential API lets us stack layers on top of each other. Its only downside is that a Sequential model cannot have multiple inputs or outputs. We can create a Sequential object and call add() to append layers one at a time. A deeper stack of convolutions usually extracts better features, so you can, for example, use four Conv2D layers:

seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', 
input_shape=x_train.shape[1:]))
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))

You can also add dropout to reduce overfitting:

seq_model.add(Dropout(rate=0.25))
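
If it helps to see what a dropout layer does to its input, below is a small NumPy sketch of inverted dropout. The function name dropout and its arguments are my own, and this is only an approximation of the idea, not the Keras implementation.

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Zero a random fraction `rate` of x and rescale the rest by 1/(1-rate)."""
    if not training or rate == 0.0:
        return x  # at inference time the input passes through unchanged
    keep_prob = 1.0 - rate
    mask = rng.random(x.shape) < keep_prob  # True where the unit is kept
    return x * mask / keep_prob

rng = np.random.default_rng(0)
activations = np.ones(1000)
dropped = dropout(activations, rate=0.25, rng=rng)

print((dropped == 0).mean())  # roughly a quarter of the units are zeroed
print(dropped.mean())         # rescaling keeps the mean close to 1
```

The rescaling is why the layer can simply be switched off at evaluation time without changing the expected activation size.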

Full model:

# Colab-only magic to select TensorFlow 2.x; remove it outside Google Colab
%tensorflow_version 2.x
from tensorflow.keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()    
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Dense, Flatten, Dropout

seq_model = Sequential()
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', 
input_shape=x_train.shape[1:]))
seq_model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu'))
seq_model.add(MaxPool2D(pool_size=(2, 2)))
seq_model.add(Dropout(rate=0.25))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
seq_model.add(MaxPool2D(pool_size=(2, 2)))
seq_model.add(Dropout(rate=0.25))
seq_model.add(Flatten())
seq_model.add(Dense(256, activation='relu'))
seq_model.add(Dropout(rate=0.5))
seq_model.add(Dense(10, activation='softmax'))


seq_model.compile(
    loss='categorical_crossentropy', 
    optimizer='adam', 
    metrics=['accuracy']
)

epochsz = 3  # number of epochs
batch_sizez = 32  # batch size; 64 or 128 are other common choices
seq_model.fit(x_train,y_train, batch_size=batch_sizez, epochs=epochsz)

Result:

Train on 60000 samples
Epoch 1/3
60000/60000 [==============================] - 186s 3ms/sample - loss: 0.1379 - accuracy: 0.9588
Epoch 2/3
60000/60000 [==============================] - 187s 3ms/sample - loss: 0.0677 - accuracy: 0.9804
Epoch 3/3
60000/60000 [==============================] - 187s 3ms/sample - loss: 0.0540 - accuracy: 0.9840
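
As a sanity check on the layer stack in the full model above, here is a plain-Python sketch that traces the spatial size through the convolutions and pools. The helpers conv_out and trace_sizes are hypothetical names of mine, and the sketch assumes the Keras defaults (stride 1 and "valid" padding for Conv2D, pool stride equal to pool size).

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a convolution or pooling window."""
    if padding == "same":
        return -(-size // stride)         # ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid": no padding

def trace_sizes(size, layers):
    """Return the spatial size after each (kernel, stride, padding) layer."""
    sizes = [size]
    for kernel, stride, padding in layers:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

# Conv 5x5, Conv 5x5, MaxPool 2x2, Conv 3x3, Conv 3x3, MaxPool 2x2
answer_layers = [
    (5, 1, "valid"), (5, 1, "valid"), (2, 2, "valid"),
    (3, 1, "valid"), (3, 1, "valid"), (2, 2, "valid"),
]

sizes = trace_sizes(28, answer_layers)
flat_features = sizes[-1] ** 2 * 64  # 64 filters in the last Conv2D

print(sizes)          # [28, 24, 20, 10, 8, 6, 3]
print(flat_features)  # 576
```

So Flatten hands 576 features to the Dense(256) layer. With padding="same", as in the question's model, the convolutions keep the size and only the pools shrink it (28, 14, 7).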

2 answers

solution1
2 ACCEPTED 2019-12-13 15:37:49

solution2
1 2020-01-12 18:26:48
