
TensorFlow not properly decoding an image

I am new to TensorFlow. I am reading images from files, decoding them with tf.image.decode_jpeg, and then plotting the decoded image with matplotlib. However, the decoded image looks different from the original.

[Image: the original image]

[Image: the decoded image as plotted with matplotlib]

import matplotlib.pyplot as plt
import tensorflow as tf

filenames = ['/Users/darshak/TensorFlow/100.jpg', '/Users/darshak/TensorFlow/10.jpg']
filename_queue = tf.train.string_input_producer(filenames)

reader = tf.WholeFileReader()
filename, content = reader.read(filename_queue)

image = tf.image.decode_jpeg(content, channels=3)

image = tf.cast(image, tf.float32)

resized_image = tf.image.resize_images(image, [256, 256])

image_batch = tf.train.batch([resized_image], batch_size=9)

sess = tf.InteractiveSession()

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

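plt.imshow(image.eval())
plt.show()

# Stop the queue-runner threads before closing the session.
coord.request_stop()
coord.join(threads)
sess.close()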

The problem arises because plt.imshow(image.eval()) interprets the image data differently depending on the element type of image.

  • If image is a tf.uint8 tensor (i.e. as it is produced by tf.image.decode_jpeg()), it contains values from 0 to 255 for the R, G, and B channels, and plt.imshow() interprets (0, 0, 0) as black and (255, 255, 255) as white.

  • When you cast image to a tf.float32 tensor, it contains values from 0.0 to 255.0 for the R, G, and B channels. plt.imshow() still interprets (0.0, 0.0, 0.0) as black, but it interprets (1.0, 1.0, 1.0) as white. All values greater than 1.0 are treated the same as 1.0, and as a result the image appears discolored.
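The effect described in the two points above can be reproduced without TensorFlow. The following is a minimal NumPy sketch, assuming a made-up 2x2 RGB array in place of a real decoded JPEG; the np.clip call is only a model of the "values above 1.0 are treated as 1.0" behaviour, not matplotlib's actual code path.

```python
import numpy as np

# Hypothetical 2x2 RGB image standing in for a decoded JPEG (uint8, 0..255).
pixels_uint8 = np.array(
    [[[0, 0, 0], [64, 128, 192]],
     [[255, 0, 0], [255, 255, 255]]],
    dtype=np.uint8,
)

# Casting changes the dtype but not the values: the range is still 0.0..255.0.
pixels_float = pixels_uint8.astype(np.float32)

# Model of the float interpretation: anything above 1.0 is treated as 1.0,
# so every nonzero channel saturates to full intensity.
as_displayed = np.clip(pixels_float, 0.0, 1.0)

print(pixels_float.max())       # largest value after the cast
print(np.unique(as_displayed))  # the only intensities left after clipping
```

In this sketch the mid-tone pixel (64, 128, 192) ends up indistinguishable from white, which is the kind of discoloration seen in the plotted image.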

If you intend to represent the image as a tf.float32 tensor and visualize it, you should divide the image values by 255.0:

image = tf.image.decode_jpeg(content, channels=3)
image = tf.cast(image, tf.float32) / 255.0
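If the rest of the pipeline needs to keep the 0.0 to 255.0 float range, the conversion could instead be done only at plotting time, on the NumPy array that image.eval() returns. This is a sketch under that assumption; the evaluated array below is a made-up stand-in, and its values are illustrative only.

```python
import numpy as np

# Stand-in for the array returned by image.eval(): float32 in 0.0..255.0.
evaluated = np.array([[[12.7, 200.2, 255.0]]], dtype=np.float32)

# Option 1: rescale to 0.0..1.0 and keep floats for plotting.
for_plot_float = evaluated / 255.0

# Option 2: round, clip, and cast back to uint8 (0..255) for plotting.
for_plot_uint8 = np.clip(np.rint(evaluated), 0, 255).astype(np.uint8)
```

Either array could then be passed to plt.imshow(); the first matches the float convention and the second matches the uint8 convention.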

