
Number recognition from an image with Python and TensorFlow

Details: Ubuntu 14.04 (LTS), OpenCV 2.4.13, Spyder 2.3.9 (Python 2.7), TensorFlow r0.10

I want to recognize a number in an image with Python and TensorFlow (OpenCV is optional).

Additionally, I want to train the model on the MNIST data with TensorFlow.

Like this (the code is taken from this page's video):

Code:

import tensorflow as tf
import random

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder("float", [None, 784])
y = tf.placeholder("float", [None, 10])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

### modeling ###

activation = tf.nn.softmax(tf.matmul(x, W) + b)

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y * tf.log(activation), reduction_indices=1))

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)

init = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init)

### training ###

for epoch in range(training_epochs) :

    avg_cost = 0
    total_batch = int(mnist.train.num_examples/batch_size)

    for i in range(total_batch) :

        batch_xs, batch_ys =mnist.train.next_batch(batch_size)
        sess.run(optimizer, feed_dict={x: batch_xs, y: batch_ys})
        avg_cost += sess.run(cross_entropy, feed_dict = {x: batch_xs, y: batch_ys}) / total_batch

    if epoch % display_step == 0 :
        print "Epoch : ", "%04d" % (epoch+1), "cost=", "{:.9f}".format(avg_cost)

print "Optimization Finished"

### predict number ###

r = random.randint(0, mnist.test.num_examples - 1)
print "Prediction: ", sess.run(tf.argmax(activation,1), {x: mnist.test.images[r:r+1]})
print "Correct Answer: ", sess.run(tf.argmax(mnist.test.labels[r:r+1], 1))

But the problem is: how can I build a NumPy array like the one below from my own image?

Code addition:

mnist.test.images[r:r+1]

[[ 0.  0.  0.  ...,  0.50196081  0.50196081  0.50196081  0.50196081  0.  ...,  0.50196081  1.  1.  1.  1.  1.  1.  0.50196081  0.25098041  0.  ...,  0.  0.  0. ]]

(output abbreviated; the full row holds 784 values between 0 and 1)

When I use OpenCV to solve the problem, I can build a NumPy array from the image, but it looks a little strange. (I want to turn the array into a 28x28 vector.)

Code addition:

image = cv2.imread("img_easy.jpg")
resized_image = cv2.resize(image, (28, 28))

[[[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]
 [[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]
 [[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]
 ...,
 [[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]
 [[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]
 [[255 255 255] [255 255 255] [255 255 255] ..., [255 255 255] [255 255 255] [255 255 255]]]

Then I fed that value ('resized_image') into the TensorFlow code, like this:

Code modification:

### predict number ###

print "Prediction: ", sess.run(tf.argmax(activation,1), {x: resized_image})
print "Correct Answer: 9"

As a result, an error occurs at this line:

ValueError: Cannot feed value of shape (28, 28, 3) for Tensor u'Placeholder_2:0', which has shape '(?, 784)'

Finally,

1) How can I produce data that can be fed into the TensorFlow code (maybe a NumPy array of shape [784])?

2) Do you know of any number recognition examples that use TensorFlow?

I'm a beginner in machine learning.

Please tell me in detail what I should do.

It seems the image you are using is a color image, hence the third dimension in (28, 28, 3).

The original MNIST images, by contrast, are grayscale with a width and height of 28. This is why the shape of your x placeholder is [None, 784]: 28 * 28 = 784.
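The arithmetic behind the error can be checked with NumPy alone. This is only a small sketch: the arrays `color` and `gray` are made-up stand-ins for the question's resized_image and for a grayscale version of it, not real image data.

```python
import numpy as np

# Stand-in for the question's resized_image: 28x28 pixels, 3 color channels.
color = np.full((28, 28, 3), 255, dtype=np.uint8)
# Stand-in for a grayscale version: 28x28 pixels, one value per pixel.
gray = np.full((28, 28), 255, dtype=np.uint8)

print(color.size)  # 28 * 28 * 3 values: three times too many for one row
print(gray.size)   # 28 * 28 values: exactly one row of the placeholder

# reshape(1, 784) produces a batch holding a single flattened image,
# which is compatible with the placeholder shape [None, 784].
row = gray.reshape(1, 784)
print(row.shape)
```

The leading 1 is the batch dimension; it plays the same role as the slice r:r+1 in mnist.test.images[r:r+1].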

OpenCV is reading the image as a 3-channel color image (in BGR order), and you want it to be grayscale, i.e. (28, 28). You may find it helpful to pass the grayscale flag to imread:

image = cv2.imread("img_easy.jpg", cv2.CV_LOAD_IMAGE_GRAYSCALE)

With this, the image has two dimensions instead of three, so after resizing it has the shape (28, 28). It still has to be flattened to (1, 784) before it is fed to x.
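Put together, the loading steps might look like the sketch below. The helper name `load_digit_as_row` is hypothetical, and the choice of cv2.INTER_AREA for shrinking is my assumption rather than something from the question. The flag value 0 is used because it is the value behind both cv2.CV_LOAD_IMAGE_GRAYSCALE (OpenCV 2.4) and cv2.IMREAD_GRAYSCALE (OpenCV 3 and later).

```python
import numpy as np


def load_digit_as_row(path, size=28):
    """Read an image file as grayscale and return a (1, size*size) uint8 row."""
    import cv2  # imported here because only this helper needs OpenCV

    # Flag 0 asks for a single-channel grayscale read.
    gray = cv2.imread(path, 0)
    if gray is None:
        # cv2.imread returns None instead of raising when the read fails.
        raise IOError("could not read image: %s" % path)

    # Grayscale loading does not change the image size, so resize explicitly.
    small = cv2.resize(gray, (size, size), interpolation=cv2.INTER_AREA)

    # One row per image, matching the placeholder shape [None, size*size].
    return small.reshape(1, size * size)
```

Something like `load_digit_as_row("img_easy.jpg")` would then return an array with the right shape for the x placeholder, although the pixel values are still in the 0-255 range.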

Also, the OpenCV image values (0-255) are not in the same range as the MNIST images shown in your question. You may have to normalize the values in the image so that they fall in the range 0-1.
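A minimal normalization sketch follows. The helper `to_mnist_range` is a hypothetical name, and the `invert` option is an assumption about the input: the MNIST row printed in the question uses 0 for the background, while the question's image is mostly 255 (white), so a dark digit on a light background probably needs to be flipped as well.

```python
import numpy as np


def to_mnist_range(gray, invert=True):
    """Scale 0-255 grayscale pixels to floats in [0, 1], optionally inverted."""
    scaled = gray.astype(np.float32) / 255.0
    if invert:
        # Turn a white background (1.0) into 0.0, as in the MNIST sample.
        scaled = 1.0 - scaled
    return scaled


# Tiny made-up patch: white background pixels and two darker "ink" pixels.
patch = np.array([[255, 255], [0, 128]], dtype=np.uint8)
print(to_mnist_range(patch))
```

Whether inversion is needed depends on the picture, so it is worth printing a few values of the result and comparing them with an MNIST sample before feeding the model.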

You might also want to use a CNN for this (slightly more advanced, but it should give better results). See the tutorials at https://www.tensorflow.org/tutorials/ for more details.

Have you tried this one? I had the same problem and this was very helpful:

resized = cv2.resize(image, dsize = (28,28), interpolation = cv2.INTER_CUBIC)
