Testing a single image on an MNIST TensorFlow CNN

Question

I've already trained a CNN on the classic MNIST dataset. What I'm trying to do is build a program that loads the saved model, takes an image (a handwritten digit that is not part of the dataset), predicts which digit is written, and outputs it. I'm stuck on getting the code to output its guess of which digit it is (0-9).

I've already worked out how to feed in a single image. What exactly do I write to find out which single class the model has assigned to the image?

Thank you

import matplotlib.image as mpimg
import tensorflow as tf

num_channels = 1
image_size = 28
pic_root = #insert file directory here

img=mpimg.imread(pic_root)
image = img.reshape(-1,image_size,image_size, num_channels)
img = tf.cast(image, tf.float32)

with tf.Session() as session:
    saver = tf.train.import_meta_graph(save_file) #loading the saved model
    image_predict = tf.nn.softmax(img)
    print(image_predict)
    soft_max = tf.nn.softmax(logits, name="softmax_tensor")
    arg_max = tf.argmax(input=logits, axis=1)
    print(arg_max)
    print(soft_max)

image_predict, soft_max and arg_max all return something, but I don't know how to get the actual prediction from them.

Answer 1

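Per your code, soft_max should evaluate to an array with one row of length 10 per input image, and arg_max to an array with one entry per input image. The 10 values in each soft_max row correspond to the 10 digits from 0 to 9. Note that print(soft_max) and print(arg_max) only print the tensor objects; to see the values, evaluate the tensors with session.run(...).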

Each value in a soft_max row is the probability that the input image matches that particular digit. Array indexing starts at zero, so the first value is the probability that the image is the digit 0, the second value is the probability that it is the digit 1, and so on up to the tenth value, which is the probability that it is the digit 9. The predicted digit is the index of the value with the highest probability.
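To make the decoding concrete, here is a minimal pure-Python sketch of what a softmax row represents. The logits values are made up for illustration, and `softmax` is a hypothetical helper written for this example, not the TensorFlow op; in the real program the numbers would come from evaluating the tensor in a session.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    # Subtracting the maximum keeps exp() numerically stable.
    exps = [math.exp(value - highest) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for one image; position i is the score for digit i.
example_logits = [0.1, -1.2, 0.3, 0.0, 4.5, 0.2, -0.7, 1.1, 0.4, 2.0]

probabilities = softmax(example_logits)

# The predicted digit is the index of the largest probability.
predicted_digit = probabilities.index(max(probabilities))
print(predicted_digit)  # 4
```

Because the index is the digit, no lookup table is needed to go from the array position to the class label.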

arg_max saves you a processing step by returning, for each input image, the index of the largest value (which in this case is the predicted digit). So if arg_max evaluates to [4, 9, 5, ...], the predicted digit for the first image is 4, for the second image 9, and so on. With a single input image it contains a single entry.

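predicted = session.run(arg_max)  # evaluate the tensor inside the session; pass feed_dict=... if the graph reads its input from a placeholder
print(predicted[0])  # should give you the predicted digit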

Once you know that the return values are arrays as explained, the output can be decoded.
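The shape of the argmax result can be sketched without TensorFlow. The following is only an illustration under assumptions: `argmax_per_row` is a hypothetical stand-in for taking the argmax along axis 1, and the batch of logits is invented. It also shows why taking the argmax of the raw logits is enough: softmax preserves the ordering of the scores, so the largest logit and the largest probability sit at the same index.

```python
def argmax_per_row(rows):
    """Return the index of the largest value in each row (argmax along axis 1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]

# Made-up logits for a batch of two images, ten scores each.
batch_logits = [
    [0.2, 0.1, 0.0, 0.3, 5.1, 0.4, 0.0, 0.2, 0.1, 1.5],  # highest score at index 4
    [0.0, 0.3, 0.1, 0.2, 0.1, 0.0, 0.4, 0.2, 0.3, 3.9],  # highest score at index 9
]

predictions = argmax_per_row(batch_logits)
print(predictions)     # [4, 9]: one predicted digit per image
print(predictions[0])  # 4: the prediction for the first image
```

With a single input image the batch has one row, so the result has one entry and index 0 is the answer.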

Answer 1
0 2018-03-05 07:06:29
