TensorFlow image resize messes up images of unknown size

I have a list of variable-size images and wish to standardise them to 256x256. I used the following code:

import tensorflow as tf
import matplotlib.pyplot as plt

file_contents = tf.read_file('image.jpg')
im = tf.image.decode_jpeg(file_contents)
im = tf.image.resize_images(im, 256, 256)

sess = tf.Session()
sess.run(tf.initialize_all_variables())

img = sess.run(im)

plt.imshow(img)
plt.show()

However, tf.image.resize_images() tends to mess up the image. Using tf.reshape(), though, seems to allow resize_images() to function correctly.

TensorFlow version: 0.8.0

Original image: [image]

Resized image: [image]

I know the skimage package can handle what I need, but I would like to keep using tf.train.shuffle_batch(). I am trying to avoid maintaining two identical datasets (one with a fixed image size), since Caffe seems to have no problem handling them.

This happens because resize_images() interpolates between adjacent pixels and returns floats instead of integers in the range 0-255. That is why NEAREST_NEIGHBOR does work: it takes the value of one of the nearby pixels without doing any further math. Suppose you have two adjacent pixels with values 240 and 241. NEAREST_NEIGHBOR will return either 240 or 241. With any other method, the value could be something like 240.5, and it is returned without rounding, I assume intentionally, so that you can decide what suits you best (floor, round up, etc.).

plt.imshow(), on the other hand, treats float values as pixel intensities on a full scale from 0.0 to 1.0, so floats in the 0-255 range fall outside the range it expects and the image is rendered incorrectly. To make the above code work, one possible solution would be:

import numpy as np
plt.imshow(img.astype(np.uint8))
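
To make the reasoning above concrete, here is a minimal NumPy-only sketch of what happens to the 240/241 example and of two ways to get the result back into a form that plt.imshow() displays as intended. The helper names to_uint8 and to_unit_float are hypothetical (they are not part of TensorFlow or matplotlib), and the explicit rounding and clipping are assumptions about what is usually wanted rather than something the original answer prescribes.

```python
import numpy as np


def to_uint8(img):
    """Round a float image in the 0-255 range and cast it to uint8.

    Hypothetical helper: clipping before the cast keeps out-of-range
    values from wrapping around when converted to uint8.
    """
    return np.clip(np.rint(img), 0, 255).astype(np.uint8)


def to_unit_float(img):
    """Rescale a float image from the 0-255 range to the 0.0-1.0 range."""
    return np.clip(img / 255.0, 0.0, 1.0)


# Linear interpolation halfway between two neighbouring pixel values
# yields a float that is not a valid integer pixel value.
left, right = 240.0, 241.0
interpolated = 0.5 * left + 0.5 * right

# A tiny one-row "image" standing in for the output of the resize op.
resized = np.array([[left, interpolated, right]], dtype=np.float32)

as_uint8 = to_uint8(resized)        # integer pixels in 0-255
as_unit = to_unit_float(resized)    # float pixels in 0.0-1.0
```

Either as_uint8 or as_unit can then be passed to plt.imshow(). The plain img.astype(np.uint8) shown above truncates the fractional part instead of rounding, which is usually indistinguishable by eye.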

