Testing a single image on the MNIST TensorFlow CNN

I've already trained a CNN on the classic MNIST dataset. What I'm trying to do is build a program that loads the saved model, takes an image (a handwritten digit that is not part of the dataset), predicts which digit is written, and outputs it. I'm stuck trying to get the code to output its guess of which digit it is (0-9).

I've already worked out how to feed in a single image. What exactly do I write to find out which single class the model has assigned to the image?

Thank you

import matplotlib.image as mpimg
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.train.import_meta_graph)

num_channels = 1
image_size = 28
pic_root = ""  # insert file directory here

img = mpimg.imread(pic_root)
image = img.reshape(-1, image_size, image_size, num_channels)
img = tf.cast(image, tf.float32)

with tf.Session() as session:
    # loading the saved model (save_file is not defined in this snippet)
    saver = tf.train.import_meta_graph(save_file)
    image_predict = tf.nn.softmax(img)
    print(image_predict)
    # logits is not defined in this snippet
    soft_max = tf.nn.softmax(logits, name="softmax_tensor")
    arg_max = tf.argmax(input=logits, axis=1)
    print(arg_max)
    print(soft_max)

image_predict, soft_max and arg_max all return something, but I don't know how to get the actual prediction from them.

Per your code, soft_max should evaluate to an array with 10 values per input image, one for each of the digits 0 to 9. arg_max is different: tf.argmax with axis=1 returns one index per input image, so for a single image it holds a single value.

Each value in the soft_max array is the probability that the input image matches that particular digit. Array indexing starts at zero, so the first value is the probability that the image is the digit 0, the second value is the probability that it is the digit 1, and so forth, up to the tenth value, which is the probability that the image is the digit 9. The predicted digit is the index of the value with the highest probability.
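
To make the mapping from probabilities to a digit concrete, here is a minimal sketch in plain Python rather than TensorFlow. The softmax helper and the example_logits values are made up for illustration and do not come from the trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtract the largest score first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for one image; index i is the score for digit i.
example_logits = [0.1, 0.3, 0.2, 2.5, 0.0, 0.4, 0.1, 0.9, 0.2, 0.3]

probs = softmax(example_logits)
# The predicted digit is the index holding the largest probability.
predicted_digit = max(range(len(probs)), key=lambda i: probs[i])
print(predicted_digit)  # 3
```

Because softmax preserves the ordering of the scores, taking the largest logit directly would give the same index.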

arg_max saves you a processing step by returning, for each input image, the index of the highest value in logits (which in this case matches the predicted digit). The result has one entry per image, so if arg_max evaluates to [4, 9, 5, ...] for a batch of images, the predicted digit for the first image is 4. With a single image, the array has just one entry.

print(session.run(arg_max)[0])  # run inside the session block; should give you the predicted digit
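
As a rough illustration of what an argmax along axis 1 computes, the sketch below does the same thing in plain Python. The argmax_per_row helper and the batch_scores values are hypothetical and only stand in for the model's output on three images.

```python
def argmax_per_row(batch):
    """Return the index of the largest value in each row (like argmax with axis=1)."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in batch]

# Hypothetical scores for three images, 10 classes each.
batch_scores = [
    [0.0, 0.1, 0.0, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0, 0.1],  # highest at index 4
    [0.0, 0.0, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 0.7],  # highest at index 9
    [0.1, 0.0, 0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0],  # highest at index 5
]

predictions = argmax_per_row(batch_scores)
print(predictions)     # [4, 9, 5]
print(predictions[0])  # 4
```

Each entry is the prediction for one image, so with a single input image the list has one element and index 0 is the answer.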

Once you know that the return values are arrays as explained, the output can be decoded.
