Getting different accuracy on test data in MNIST digit recognition in Keras

I am doing handwritten digit recognition using Keras and I have two files: predict.py and train.py.

train.py trains the model (if it has not already been trained) and saves it to a directory; otherwise it just loads the trained model from the directory it was saved to and prints the Test Loss and Test Accuracy.

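import os

from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential, model_from_json
from keras.utils import to_categorical

def getData():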
    (X_train, y_train), (X_test, y_test) = mnist.load_data()
    y_train = to_categorical(y_train, num_classes=10)
    y_test = to_categorical(y_test, num_classes=10)
    X_train = X_train.reshape(X_train.shape[0], 784)
    X_test = X_test.reshape(X_test.shape[0], 784)
    
    # normalizing the data to help with the training
    X_train /= 255
    X_test /= 255
    
 
    return X_train, y_train, X_test, y_test

def trainModel(X_train, y_train, X_test, y_test):
    # training parameters
    batch_size = 1
    epochs = 10
    # create model and add layers
    model = Sequential()    
    model.add(Dense(64, activation='relu', input_shape=(784,)))
    model.add(Dense(10, activation = 'softmax'))

  
    # compiling the sequential model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    # training the model and saving metrics in history
    history = model.fit(X_train, y_train,
          batch_size=batch_size, epochs=epochs,
          verbose=2,
          validation_data=(X_test, y_test))

    loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)
    print("Test Loss", loss_and_metrics[0])
    print("Test Accuracy", loss_and_metrics[1])
    
    # Save model structure and weights
    model_json = model.to_json()
    with open('model.json', 'w') as json_file:
        json_file.write(model_json)
    model.save_weights('mnist_model.h5')
    return model

def loadModel():
    json_file = open('model.json', 'r')
    model_json = json_file.read()
    json_file.close()
    model = model_from_json(model_json)
    model.load_weights("mnist_model.h5")
    return model

X_train, y_train, X_test, y_test = getData()

if(not os.path.exists('mnist_model.h5')):
    model = trainModel(X_train, y_train, X_test, y_test)
    print('trained model')
    print(model.summary())
else:
    model = loadModel()
    print('loaded model')
    print(model.summary())
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)
    print("Test Loss", loss_and_metrics[0])
    print("Test Accuracy", loss_and_metrics[1])
   

Here is the output (assuming the model was trained earlier, so this time it is just loaded):

('Test Loss', 1.741784990310669)

('Test Accuracy', 0.414)

predict.py, on the other hand, predicts a handwritten number:

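from keras.datasets import mnist
from keras.models import model_from_json
from keras.utils import to_categorical

def loadModel():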
    json_file = open('model.json', 'r')
    model_json = json_file.read()
    json_file.close()
    model = model_from_json(model_json)
    model.load_weights("mnist_model.h5")
    return model

model = loadModel()

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

(X_train, y_train), (X_test, y_test) = mnist.load_data()
y_test = to_categorical(y_test, num_classes=10)
X_test = X_test.reshape(X_test.shape[0], 28*28)


loss_and_metrics = model.evaluate(X_test, y_test, verbose=2)

print("Test Loss", loss_and_metrics[0])
print("Test Accuracy", loss_and_metrics[1])

In this case, to my surprise, I get the following result:

('Test Loss', 1.8380377866744995)

('Test Accuracy', 0.8856)

In the second file, I am getting a Test Accuracy of 0.88 (more than double what I was getting before).

Also, model.summary() is the same in both files:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_1 (Dense)              (None, 64)                50240     
_________________________________________________________________
dense_2 (Dense)              (None, 10)                650       
=================================================================
Total params: 50,890
Trainable params: 50,890
Non-trainable params: 0
_________________________________________________________________

I can't figure out the reason behind this behavior. Is it normal? Or am I missing something?

The discrepancy arises because in one case you call the evaluate() method with normalized data (i.e. divided by 255), while in the other (i.e. in the "predict.py" file) you call it with un-normalized data. At inference time (i.e. test time) you should always apply the same pre-processing steps you used for the training data.
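One way to keep the two scripts consistent is to route both through a single pre-processing helper. The sketch below is only an illustration: the `preprocess` function is a hypothetical name that does not appear in the question, and random uint8 arrays stand in for the output of `mnist.load_data()` so the snippet runs without Keras.

```python
import numpy as np

def preprocess(images):
    """Flatten 28x28 images to 784-vectors and scale pixels to [0, 1]."""
    flat = images.reshape(images.shape[0], -1)   # (n, 28, 28) -> (n, 784)
    return flat.astype("float32") / 255.0        # returns a new float32 array

# Stand-in for the X_test returned by mnist.load_data():
# random uint8 "images" (an assumption, not real MNIST data).
rng = np.random.default_rng(0)
fake_test_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

X_test = preprocess(fake_test_images)
print(X_test.shape, X_test.dtype, X_test.min(), X_test.max())
```

If both train.py and predict.py call the same helper before `model.fit` and `model.evaluate`, the model should see inputs in the same range in both places.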

Further, first convert the data to floating point and then divide it by 255. Otherwise, with / applied to the integer arrays, an integer (floor) division is done in Python 2.x, and in Python 3.x you would get an error when running X_train /= 255 and X_test /= 255:

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255.
X_test /= 255.
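To see why the cast matters, here is a small NumPy-only sketch, assuming Python 3 and a three-pixel uint8 array standing in for the MNIST images. The variable names `pixels`, `in_place_failed`, and `scaled` are illustrative and not part of the original scripts.

```python
import numpy as np

pixels = np.array([0, 128, 255], dtype=np.uint8)

# In-place true division would have to write float results back into a
# uint8 array, which NumPy refuses to do under Python 3.
try:
    pixels /= 255
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Casting to float first gives an array that can hold the scaled values.
scaled = pixels.astype("float32")
scaled /= 255.0
print(in_place_failed, scaled)
```

After the cast, the in-place division works and the values fall in the range [0, 1].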
