
TensorFlow equivalent of PyTorch's transforms.Normalize()

I'm trying to run inference on a TFLite model that was originally built in PyTorch. I have been following the PyTorch implementation and have to preprocess images along the RGB channels. The closest TensorFlow equivalent of transforms.Normalize() I found is tf.image.per_image_standardization(). Although this is a fairly close match, tf.image.per_image_standardization() computes a single mean and std over all values in the image (across all channels) and applies them to every channel. Here is its full implementation:

def per_image_standardization(image):
  """Linearly scales `image` to have zero mean and unit norm.
  This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
  of all values in image, and
  `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.
  `stddev` is the standard deviation of all values in `image`. It is capped
  away from zero to protect against division by 0 when handling uniform images.
  Args:
    image: 3-D tensor of shape `[height, width, channels]`.
  Returns:
    The standardized image with same shape as `image`.
  Raises:
    ValueError: if the shape of 'image' is incompatible with this function.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  num_pixels = math_ops.reduce_prod(array_ops.shape(image))

  image = math_ops.cast(image, dtype=dtypes.float32)
  image_mean = math_ops.reduce_mean(image)

  variance = (math_ops.reduce_mean(math_ops.square(image)) -
              math_ops.square(image_mean))
  variance = gen_nn_ops.relu(variance)
  stddev = math_ops.sqrt(variance)

  # Apply a minimum normalization that protects us against uniform images.
  min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float32))
  pixel_value_scale = math_ops.maximum(stddev, min_stddev)
  pixel_value_offset = image_mean

  image = math_ops.subtract(image, pixel_value_offset)
  image = math_ops.div(image, pixel_value_scale)
  return image

PyTorch's transforms.Normalize(), in contrast, lets us specify the mean and std to apply to each channel, as below.

from torchvision import transforms

# transformation
pose_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
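To make the contrast concrete, here is a minimal NumPy sketch (not the library code) of the two computations. The helper names `standardize_global` and `normalize_fixed` and the random test image are assumptions for illustration. The first helper follows the formula in the docstring above; the second mirrors the arithmetic of transforms.Normalize() on an image already scaled to [0, 1], but keeps the height-width-channel layout for easy comparison (ToTensor() would also move the channels first).

```python
import numpy as np

def standardize_global(image):
    """(x - mean) / max(stddev, 1/sqrt(N)), with statistics over ALL values."""
    image = image.astype(np.float32)
    min_std = 1.0 / np.sqrt(image.size)
    return (image - image.mean()) / max(image.std(), min_std)

def normalize_fixed(image, mean, std):
    """(x - mean[c]) / std[c], with fixed, user-supplied per-channel statistics."""
    mean = np.asarray(mean, dtype=np.float32)  # shape (3,), broadcasts over H and W
    std = np.asarray(std, dtype=np.float32)
    return (image.astype(np.float32) - mean) / std

rng = np.random.default_rng(0)
img01 = rng.random((4, 5, 3))  # hypothetical HWC image already scaled to [0, 1]

a = standardize_global(img01)
b = normalize_fixed(img01, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])

print(a.mean(), a.std())      # statistics come from the image itself
print(b.mean(axis=(0, 1)))    # statistics are fixed, so the result depends on content
```

The first output is data-dependent by construction (every image ends up with roughly zero mean and unit std), while the second applies the same constants to every image, which is what a model trained with fixed normalization constants expects.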

What would be a way to get this functionality in TensorFlow 2.x?

Edit: I put together a quick hack that seems to solve this by defining a function like this:

def normalize_image(image, mean, std):
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel])/std[channel]
    
    return image

I'm not sure how efficient this is, but it seems to get the job done. I still have to convert the output to a tensor before feeding it to the model.
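One possible tidier version, as a sketch only: NumPy broadcasting applies the per-channel statistics in a single expression, without the loop and without modifying the input array in place. The function name `normalize_image_vectorized` and the sample array are assumptions for illustration. The sketch casts to float first, because a raw `uint8` image array cannot hold the fractional, negative results that normalization produces.

```python
import numpy as np

def normalize_image_vectorized(image, mean, std):
    """Return a new float32 HWC array with (image - mean[c]) / std[c] per channel."""
    image = np.asarray(image, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)  # shape (C,), broadcast over H and W
    std = np.asarray(std, dtype=np.float32)
    return (image - mean) / std

# hypothetical 2x2 RGB image with values in [0, 255]
raw = np.array([[[0, 0, 0], [255, 255, 255]],
                [[128, 64, 32], [10, 20, 30]]], dtype=np.uint8)

out = normalize_image_vectorized(raw / 255.0,
                                 mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
print(out.shape, out.dtype)
print(out.min(), out.max())
```

Because the function returns a new array, the original image stays untouched and can be reused.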

The workaround you mentioned seems OK. However, using a for loop to normalize each RGB channel of a single image can be slow when you process a large dataset in a data pipeline (a generator or tf.data). Here is a demonstration of your approach, followed by two possible alternatives that may work for you.

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.pyplot import imshow, subplot, title, hist

# load image (RGB)
img = Image.open('/content/9.jpg')

def normalize_image(image, mean, std):
    # normalize each channel in place with its own mean and std
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel]) / std[channel]
    return image

# scale to [0, 1] first (this also makes the array float), then normalize
OP_approach = normalize_image(np.array(img) / 255.0,
                            mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225])

Now, let's look at the properties of the transformed image.

plt.figure(figsize=(25, 10))
subplot(121); imshow(OP_approach)
title(f'Normalized Image \n min-px: {OP_approach.min()} \n max-pix: {OP_approach.max()}')
subplot(122); hist(OP_approach.ravel(), bins=50, density=True)
title('Histogram - pixel distribution')


The minimum and maximum pixel values after normalization are -2.1179039301310043 and 2.6399999999999997, respectively.
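For a tf.data pipeline, the same per-channel arithmetic can be written with TensorFlow broadcasting instead of a loop. The following is a minimal sketch under stated assumptions: the names `MEAN`, `STD`, and `normalize_tf` are made up for illustration, and the random array stands in for real images.

```python
import numpy as np
import tensorflow as tf

MEAN = tf.constant([0.485, 0.456, 0.406], dtype=tf.float32)
STD = tf.constant([0.229, 0.224, 0.225], dtype=tf.float32)

def normalize_tf(image):
    """Scale a uint8 HWC (or NHWC) image to [0, 1], then normalize each channel."""
    image = tf.cast(image, tf.float32) / 255.0
    return (image - MEAN) / STD  # MEAN and STD broadcast along the last axis

# hypothetical stand-in data: four random 8x8 RGB images
images = np.random.randint(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)

dataset = (tf.data.Dataset.from_tensor_slices(images)
           .map(normalize_tf, num_parallel_calls=tf.data.AUTOTUNE)
           .batch(2))

for batch in dataset:
    print(batch.shape, batch.dtype)
```

The output is already a tensor, so no separate conversion step is needed. If the converted model expects channels-first input, as PyTorch models usually do, a transpose after normalization may also be required; the interpreter's input details are the place to check.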

Option 2

We can use the tf.keras...Normalization preprocessing layer to do the same. It takes two important arguments: mean and variance (the square of the std).

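import numpy as np
# In recent TF 2.x releases the layer is available as tf.keras.layers.Normalization;
# older 2.x releases expose it under tf.keras.layers.experimental.preprocessing.
from tensorflow.keras.layers import Normalization

input_data = np.array(img) / 255.0
# variance is std squared; the std values are the PyTorch ones (0.229, 0.224, 0.225)
layer = Normalization(mean=[0.485, 0.456, 0.406],
                      variance=[np.square(0.229),
                                np.square(0.224),
                                np.square(0.225)])

normalized = layer(input_data).numpy()  # apply the layer once and reuse the result

plt.figure(figsize=(25, 10))
subplot(121); imshow(normalized)
title(f'Normalized Image \n min-px: {normalized.min()} \n max-pix: {normalized.max()}')
subplot(122); hist(normalized.ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

With the first-channel std set to 0.229, the minimum and maximum pixel values after normalization are approximately -2.1179 and 2.64, in line with the loop-based result. Note that using 0.299 instead of 0.229 for that std gives -2.0357144 and 2.64.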

Option 3

This option subtracts the average of the three channel means and divides by the average of the three channel stds, so every channel gets the same constants.

import tensorflow as tf

# 0.449 and 0.226 are the averages of the three channel means and stds
norm_img = ((tf.cast(np.array(img), tf.float32) / 255.0) - 0.449) / 0.226

plt.figure(figsize=(25, 10))
subplot(121); imshow(norm_img.numpy())
title(f'Normalized Image \n min-px: {norm_img.numpy().min()} \n max-pix: {norm_img.numpy().max()}')
subplot(122); hist(norm_img.numpy().ravel(), bins=50, density=True)
title('Histogram - pixel distribution')


The minimum and maximum pixel values after normalization are -1.9867257 and 2.4380531, respectively. Lastly, if we compare with the PyTorch way, there is not much difference among these approaches.

import torchvision.transforms as transforms

transform_norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225]),
])
norm_pt = transform_norm(img)

plt.figure(figsize=(25,10))
subplot(121); imshow(np.array(norm_pt).transpose(1, 2, 0));\
  title(f'Normalized Image \n min-px: \
  {np.array(norm_pt).min()} \n max-pix: {np.array(norm_pt).max()}')
subplot(122); hist(np.array(norm_pt).ravel(), bins=50, density=True); \
  title('Histogram - pixel distribution')


The minimum and maximum pixel values after normalization are -2.117904 and 2.64, respectively.
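The reported ranges can be checked with a little arithmetic. Assuming the test image contains both 0 and 255 in every channel (an assumption, though the numbers above suggest it), the extremes of each approach follow directly from the pixel extremes 0 and 1. A small NumPy sketch:

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Per-channel normalization (the loop approach and PyTorch's Normalize)
per_channel_min = ((0.0 - mean) / std).min()
per_channel_max = ((1.0 - mean) / std).max()

# Option 3: one averaged mean and std shared by all channels
avg_min = (0.0 - 0.449) / 0.226
avg_max = (1.0 - 0.449) / 0.226

print(per_channel_min, per_channel_max)
print(avg_min, avg_max)
```

Under that assumption, the averaged variant differs from per-channel normalization by up to about 0.2 at the ends of the [0, 1] range; whether that matters depends on how sensitive the model is to its input statistics.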
