
Inconsistency between image resizing with Keras (PIL) and TensorFlow?

I am puzzled by an apparent inconsistency between:

  1. the image resizing functionality in keras.preprocessing, which wraps PIL functions
  2. the image resizing functions in TensorFlow's tf.image.

I am training a deep learning model for a computer vision task with Keras (actually, with tf.keras, but that doesn't matter here). I then serve the model with TF Serving, which requires me to send images to the model as encoded byte-strings, where they are decoded using tf.image.decode_png before going through the model graph.

The problem occurs when I resize the images. Resizing with bilinear interpolation (or any other method) gives different results with PIL than with tf.image, to such an extent that the model's classification changes depending on which function I use.

The code below provides a reproducible example.

import numpy as np 
from PIL import Image
from keras.preprocessing.image import load_img, img_to_array
import tensorflow as tf

# Generate an 'image' with numpy, save as png
np.random.seed(42)
image = np.random.randint(0, 255, size=(256, 256, 3)).astype(np.uint8)
Image.fromarray(image).convert("RGB").save('my_image.png')

Now, let's load the image in two ways: first with the Keras PIL wrappers, as during model training; then as an encoded byte-string decoded with TensorFlow functions, as in my model server.

# Using Keras PIL wrappers
keras_image = img_to_array(load_img('./my_image.png')) 

# Using TF functionalities
with tf.Session() as sess:
    with open('./my_image.png', 'rb') as f:
        tf_image_ = tf.image.decode_png(f.read())
    tf_image = sess.run(tf_image_)

So far so good, as both images are exactly the same (apart from the dtype, as Keras has cast the image to float32):

# Assert equality
np.array_equal(keras_image, tf_image)
> True

Repeating this code with resizing, however, gives a different result:

# Using Keras PIL wrappers, with resizing
keras_image_rs = img_to_array(load_img('./my_image.png',
                             target_size=(224, 224),
                             interpolation='bilinear'))

# Using TF functionalities, with resizing
with tf.Session() as sess:
    with open('./my_image.png', 'rb') as f:
        tf_image_ = tf.image.decode_png(f.read())
        # Add and remove dimension
        # As tf.image.resize_* requires a batch dimension
        tf_image_ = tf.expand_dims(tf_image_, 0)
        tf_image_ = tf.image.resize_bilinear(tf_image_,
                                             [224, 224],
                                             align_corners=True)
        tf_image_ = tf.squeeze(tf_image_, axis=[0])

    tf_image_rs = sess.run(tf_image_)

# Assert equality
np.array_equal(keras_image_rs, tf_image_rs)
> False

The mean absolute difference between the two images is non-negligible:

np.mean(np.abs(keras_image_rs - tf_image_rs))
7.982703

I played with the align_corners argument and also tried the other available interpolation methods. None of them produce the same output as resizing the image with PIL. This is quite annoying, as it introduces a skew between training and serving results. Does anyone have an idea what causes this behavior, or how to fix it?

The behavior you describe matches exactly what is written in this article: "How Tensorflow's tf.image.resize stole 60 days of my life".

In short: yes, PIL, scikit-image, OpenCV, and other common image-manipulation libraries behave correctly, while tf.image.resize behaves differently, and that behavior will not be changed so as not to break old trained models.
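The root cause is the coordinate mapping each library uses when sampling the source image: PIL maps output pixels to source coordinates with a half-pixel offset, while the old tf.image.resize default does not. The following is a minimal 1-D NumPy sketch of the two conventions (a simplified illustration, not TF's or PIL's actual implementation) showing that the same bilinear resize produces different pixels:

```python
import numpy as np

def resize_1d(x, out_len, half_pixel):
    """Bilinear 1-D resize under two coordinate conventions."""
    in_len = len(x)
    scale = in_len / out_len
    out = np.empty(out_len)
    for i in range(out_len):
        if half_pixel:
            # PIL/OpenCV-style mapping: sample at pixel centers
            src = (i + 0.5) * scale - 0.5
        else:
            # Old tf.image.resize default mapping (align_corners=False)
            src = i * scale
        src = min(max(src, 0.0), in_len - 1)
        lo = int(np.floor(src))
        hi = min(lo + 1, in_len - 1)
        frac = src - lo
        out[i] = x[lo] * (1 - frac) + x[hi] * frac
    return out

x = np.array([0.0, 10.0, 20.0, 30.0])
print(resize_1d(x, 2, half_pixel=True))   # samples at 0.5 and 2.5 -> 5.0, 25.0
print(resize_1d(x, 2, half_pixel=False))  # samples at 0.0 and 2.0 -> 0.0, 20.0
```

With align_corners=True the mapping changes yet again (src = dst * (in - 1) / (out - 1)), which is why none of the tf.image options reproduce PIL's output.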

Hence, you should always preprocess your image using the same library outside the computational graph.
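For the TF Serving setup described in the question, one way to follow this advice is to resize with PIL on the client before encoding, so the model graph only decodes and never resizes. A sketch, assuming square target images (the helper name is hypothetical):

```python
import io

import numpy as np
from PIL import Image

def preprocess_for_serving(png_bytes, size=(224, 224)):
    """Resize with PIL *before* sending to the model server, so training
    and serving see identically resized pixels. Hypothetical helper."""
    img = Image.open(io.BytesIO(png_bytes)).convert("RGB")
    # Note: PIL's size argument is (width, height); irrelevant for squares
    img = img.resize(size, resample=Image.BILINEAR)
    return np.asarray(img, dtype=np.float32)

# Usage: round-trip the random image from the question
np.random.seed(42)
arr = np.random.randint(0, 255, size=(256, 256, 3)).astype(np.uint8)
buf = io.BytesIO()
Image.fromarray(arr).save(buf, format="PNG")
resized = preprocess_for_serving(buf.getvalue())
print(resized.shape)  # (224, 224, 3)
```

If the serving signature must keep accepting encoded byte-strings, you can re-encode the PIL-resized image to PNG and send that instead, leaving the graph untouched.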

Link to the relevant github thread: https://github.com/tensorflow/tensorflow/issues/6720
