
How to replicate PyTorch normalization in OpenCV or NumPy?

I need to replicate PyTorch image normalization in OpenCV or NumPy.

Quick backstory: I'm working on a project where I train in PyTorch, but I'll have to do inference in OpenCV because I'm deploying to an embedded device and don't have the storage to install PyTorch. I train in PyTorch and save a PyTorch graph, which I then convert to an ONNX graph. For inference in OpenCV, I open the image as an OpenCV image (i.e. a NumPy array), resize it, and then in sequence call cv2.normalize, cv2.dnn.blobFromImage, net.setInput, and net.forward.

When I run test inference in PyTorch versus in OpenCV, I get slightly different accuracy results, and I suspect the difference is because the normalization processes produce slightly different results between the two.

Here is a quick script I put together to show the difference on a single image. Note that I'm using grayscale (single channel) and I'm normalizing into the -1.0 to +1.0 range:

# scratchpad.py

import torch
import torchvision

import cv2
import numpy as np
import PIL
from PIL import Image

TRANSFORM = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(),
    torchvision.transforms.Normalize([0.5], [0.5])
])

def main():
    # 1st show PyTorch normalization

    # open the image as an OpenCV image
    openCvImage = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
    # convert OpenCV image to PIL image
    pilImage = PIL.Image.fromarray(openCvImage)
    # convert PIL image to a PyTorch tensor
    ptImage = TRANSFORM(pilImage).unsqueeze(0)
    # show the PyTorch tensor info
    print('\nptImage.shape = ' + str(ptImage.shape))
    print('ptImage max = ' + str(torch.max(ptImage)))
    print('ptImage min = ' + str(torch.min(ptImage)))
    print('ptImage avg = ' + str(torch.mean(ptImage)))
    print('ptImage: ')
    print(str(ptImage))

    # 2nd show OpenCV normalization

    # resize the image
    openCvImage = cv2.resize(openCvImage, (224, 224))
    # convert to float32 (necessary for passing into cv2.dnn.blobFromImage, which is not shown here)
    openCvImage = openCvImage.astype('float32')
    # use OpenCV version of normalization, could also do this with numpy
    cv2.normalize(openCvImage, openCvImage, 1.0, -1.0, cv2.NORM_MINMAX)
    # show results
    print('\nopenCvImage.shape = ' + str(openCvImage.shape))
    print('openCvImage max = ' + str(np.max(openCvImage)))
    print('openCvImage min = ' + str(np.min(openCvImage)))
    print('openCvImage avg = ' + str(np.mean(openCvImage)))
    print('openCvImage: ')
    print(str(openCvImage))

    print('\ndone !!\n')
# end function

if __name__ == '__main__':
    main()

Here is the test image I'm using:


Here are the results I'm currently getting:

$ python3 scratchpad.py 

ptImage.shape = torch.Size([1, 1, 224, 224])
ptImage max = tensor(0.9608)
ptImage min = tensor(-0.9686)
ptImage avg = tensor(0.1096)
ptImage: 
tensor([[[[ 0.0431, -0.0431,  0.1294,  ...,  0.8510,  0.8588,  0.8588],
          [ 0.0510, -0.0510,  0.0980,  ...,  0.8353,  0.8510,  0.8431],
          [ 0.0588, -0.0431,  0.0745,  ...,  0.8510,  0.8588,  0.8588],
          ...,
          [ 0.6157,  0.6471,  0.5608,  ...,  0.6941,  0.6627,  0.6392],
          [ 0.4902,  0.3961,  0.3882,  ...,  0.6627,  0.6471,  0.6706],
          [ 0.3725,  0.4039,  0.5451,  ...,  0.6549,  0.6863,  0.6549]]]])

openCvImage.shape = (224, 224)
openCvImage max = 1.0000001
openCvImage min = -1.0
openCvImage avg = 0.108263366
openCvImage: 
[[ 0.13725497 -0.06666661  0.20000008 ...  0.8509805   0.8666668
   0.8509805 ]
 [ 0.15294124 -0.06666661  0.09019614 ...  0.8274511   0.8431374
   0.8274511 ]
 [ 0.12156869 -0.06666661  0.0196079  ...  0.8509805   0.85882366
   0.85882366]
 ...
 [ 0.5843138   0.74117655  0.5450981  ...  0.83529425  0.59215695
   0.5764707 ]
 [ 0.6862746   0.34117654  0.39607853 ...  0.67843145  0.6705883
   0.6470589 ]
 [ 0.34117654  0.4117648   0.5215687  ...  0.5607844   0.74117655
   0.59215695]]

done !!

As you can see, the results are similar but definitely not exactly the same.

How can I perform normalization in OpenCV so that it is the same, or nearly the same, as the PyTorch normalization? I've tried various options in OpenCV and NumPy but can't get any closer than the results above, which are substantially different.

--- EDIT ---

In response to Ivan, I also tried this:

# resize the image
openCvImage = cv2.resize(openCvImage, (224, 224))
# convert to float32 (necessary for passing into cv2.dnn.blobFromImage, which is not shown here)
openCvImage = openCvImage.astype('float32')
mean = np.mean(openCvImage)
stdDev = np.std(openCvImage)
openCvImage = (openCvImage - mean) / stdDev
# show results
print('\nopenCvImage.shape = ' + str(openCvImage.shape))
print('openCvImage max = ' + str(np.max(openCvImage)))
print('openCvImage min = ' + str(np.min(openCvImage)))
print('openCvImage avg = ' + str(np.mean(openCvImage)))
print('openCvImage: ')
print(str(openCvImage))

The results are:

openCvImage.shape = (224, 224)
openCvImage max = 2.1724665
openCvImage min = -2.6999729
openCvImage avg = 7.298528e-09
openCvImage: 
[[ 0.07062991 -0.42616782  0.22349077 ...  1.809422    1.8476373
   1.809422  ]
 [ 0.10884511 -0.42616782 -0.04401573 ...  1.7520993   1.7903144
   1.7520993 ]
 [ 0.0324147  -0.42616782 -0.21598418 ...  1.809422    1.8285296
   1.8285296 ]
 ...
 [ 1.1597633   1.5419154   1.0642253  ...  1.7712069   1.178871
   1.1406558 ]
 [ 1.4081622   0.56742764  0.70118093 ...  1.3890547   1.3699471
   1.3126242 ]
 [ 0.56742764  0.7393961   1.0069026  ...  1.1024406   1.5419154
   1.178871  ]]

This is similar to the PyTorch normalization but clearly not the same.

I'm trying to implement a normalization in OpenCV that produces the same results as the PyTorch normalization.

I realize that, due to slight differences in the resize operation (and possibly very small rounding differences), I may never get exactly identical normalized results, but I'd like to get as close to the PyTorch results as possible.

According to the documentation, torchvision.transforms.Normalize() normalizes with mean and std, that is:

output[channel] = (input[channel] - mean[channel]) / std[channel]

In your code,

cv2.normalize(openCvImage, openCvImage, 1.0, -1.0, cv2.NORM_MINMAX)

is min-max scaling. These are two different normalizations. You can simply reconstruct the scaling with:

openCvImage = (openCvImage - 0.5) / 0.5
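Note that this mapping assumes the values are already in [0.0, 1.0], as ToTensor produces them. A minimal NumPy sketch of what (x - 0.5) / 0.5 does to that range:

```python
import numpy as np

# Sketch only: values already scaled to [0.0, 1.0] (the ToTensor range)
x = np.array([0.0, 0.25, 0.5, 1.0], dtype=np.float32)

# shift by the mean 0.5 and divide by the std 0.5 -> [-1.0, 1.0]
scaled = (x - 0.5) / 0.5
print(scaled)  # [-1.  -0.5  0.   1. ]
```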

@Quang Hoang has already explained the differences; I just want to add some details. The function cv2.normalize performs min-max scaling. It maps values from [min(data), max(data)] to a provided interval [a, b], here [-1, 1]. It is therefore equivalent to computing data = (data - min(data)) / (max(data) - min(data)) * (b - a) + a.

Here is openCvImage before and after calling cv2.normalize:

openCvImage (before) -------------
shape = (224, 224)
min = 0.0
max = 255.0
avg = 141.2952

openCvImage (after) -------------
shape = (224, 224)
min = -1.0
max = 1.0
avg = 0.10819771

So cv2.normalize(openCvImage, openCvImage, 1.0, -1.0, cv2.NORM_MINMAX) is equivalent to (openCvImage - openCvImage.min()) / (openCvImage.max() - openCvImage.min()) * 2 - 1.
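The same equivalence can be checked with plain NumPy, reconstructing the min-max formula directly rather than calling cv2.normalize (a sketch with made-up values):

```python
import numpy as np

def min_max_scale(data, a=-1.0, b=1.0):
    # map [min(data), max(data)] onto [a, b], as cv2.NORM_MINMAX does
    lo, hi = data.min(), data.max()
    return (data - lo) / (hi - lo) * (b - a) + a

img = np.array([0.0, 63.75, 127.5, 255.0], dtype=np.float32)
scaled = min_max_scale(img)
print(scaled)  # [-1.  -0.5  0.   1. ]
```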


On the other hand, torchvision.transforms.Normalize performs a shift-scale transform: data = (data - mean) / std. This can be a bit confusing, though, because mean is not necessarily the mean of the input data (and the same goes for the standard deviation). I hope you will notice that the mean and std of the PyTorch tensor are not 0.5 and 0.5, respectively:

ptImage-------------
shape = torch.Size([224, 224])
avg = tensor(0.5548)
std = tensor(0.5548)

If you wish to standardize your data, i.e. end up with mean=0 and std=1, you can compute the z-score (using torchvision.transforms.Normalize). But you can only do that by first measuring your data's mean and standard deviation.
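A NumPy sketch of that z-score step (random data stands in for the image; passing the measured values to torchvision.transforms.Normalize would do the same):

```python
import numpy as np

# stand-in for a 224x224 grayscale image
img = np.random.default_rng(0).uniform(0, 255, size=(224, 224))

# first measure the data's own statistics, then shift-scale with them
mean, std = img.mean(), img.std()
z = (img - mean) / std

print(abs(z.mean()) < 1e-6, abs(z.std() - 1.0) < 1e-6)  # True True
```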


Also note that torchvision.transforms.ToTensor does perform a min-max operation, as its documentation states:

if the PIL Image belongs to one of the modes (L, LA, P, I, F, RGB, YCbCr, RGBA, CMYK, 1) or if the numpy.ndarray has dtype = np.uint8
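On a uint8 input, that scaling amounts to dividing by 255 (a NumPy sketch of what ToTensor does to the values, ignoring the axis reordering):

```python
import numpy as np

# uint8 pixel values in [0, 255]
img_u8 = np.array([0, 51, 102, 255], dtype=np.uint8)

# ToTensor's value scaling on uint8 input: divide by 255 -> [0.0, 1.0]
as_float = img_u8.astype(np.float32) / 255.0
print(as_float)  # [0.  0.2 0.4 1. ]
```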

Building on @Quang Hoang's and @Ivan's answers above, I ran into a similar issue and had some success with a few modifications to your original code. Using a sample image, I was able to get similar mean pixel intensity values (within 3%) in the PyTorch- and OpenCV-transformed images. Additionally, when testing with a local ONNX model, the PyTorch and OpenCV images written out by the script gave identical predictions and similar confidences.

import torch
import torchvision

import cv2
import numpy as np
import PIL
from PIL import Image

TRANSFORM = torchvision.transforms.Compose([
    torchvision.transforms.Resize((224, 224)),
    torchvision.transforms.ToTensor(), # (H, W, C) [0, 255] -> (C, H, W) [0.0, 1.0]
    torchvision.transforms.Normalize([0.5], [0.5])
])

# 1st show PyTorch normalization
# open the image as an OpenCV image
openCvImage = cv2.imread('image.jpg')

# convert OpenCV image to PIL image
pilImage = PIL.Image.fromarray(openCvImage)

# convert PIL image to a PyTorch tensor, swap axes to format for imwrite
ptImageResize = np.array(TRANSFORM(pilImage)).swapaxes(0,2).swapaxes(0,1)

cv2.imshow('pytorch-transforms', ptImageResize)
cv2.imwrite('image-pytorch-transforms.jpg', ptImageResize)

# show the PyTorch tensor info
print('\nptImageResize.shape = ' + str(ptImageResize.shape))
print('ptImageResize max = ' + str(np.max(ptImageResize)))
print('ptImageResize min = ' + str(np.min(ptImageResize)))
print('ptImageResize avg = ' + str(np.mean(ptImageResize)))
print('ptImageResize: ')
print(str(ptImageResize))

# 2nd show OpenCV normalization
# resize the image
openCvImageResize = cv2.resize(openCvImage, (224, 224), interpolation=cv2.INTER_NEAREST)

# Rescale image from [0, 255] to [0.0, 1.0] as in the PyTorch ToTensor() method
# img = (img - mean) / stdDev
openCvImageResize = openCvImageResize / 255

# Normalize the image to mean and std
mean = [0.5]
std = [0.5]
openCvImageResize = (openCvImageResize - mean) / std

cv2.imshow('opencv-transforms', openCvImageResize)
cv2.imwrite('image-opencv-transforms.jpg', openCvImageResize)

# show results
print('\nopenCvImageResize.shape = ' + str(openCvImageResize.shape))
print('openCvImageResize max = ' + str(np.max(openCvImageResize)))
print('openCvImageResize min = ' + str(np.min(openCvImageResize)))
print('openCvImageResize avg = ' + str(np.mean(openCvImageResize)))
print('openCvImageResize: ')
print(str(openCvImageResize))
    
cv2.waitKey(0)
cv2.destroyAllWindows()

This may help. If you look at the actual implementation of

torchvision.transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    )

the block below is what it actually does:

import numpy as np
from PIL import Image
MEAN = 255 * np.array([0.485, 0.456, 0.406])
STD = 255 * np.array([0.229, 0.224, 0.225])
img_pil = Image.open("ty.jpg")
x = np.array(img_pil)
x = x.transpose(-1, 0, 1)
x = (x - MEAN[:, None, None]) / STD[:, None, None]
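Folding 255 into the constants this way is equivalent to dividing by 255 first and then using the fractional mean/std, which can be checked:

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

# stand-in for a (C, H, W) image with raw [0, 255] values
x = np.random.default_rng(2).integers(0, 256, size=(3, 8, 8)).astype(np.float64)

# constants pre-multiplied by 255, applied to the raw values
folded = (x - 255 * MEAN[:, None, None]) / (255 * STD[:, None, None])

# divide by 255 first, then apply the fractional mean/std
two_step = (x / 255.0 - MEAN[:, None, None]) / STD[:, None, None]

print(np.allclose(folded, two_step))  # True
```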

And here is the same thing done on the image, OpenCV-style:

import cv2

# read the image (assuming the same input file as the block above)
img = cv2.imread("ty.jpg")
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = img/255.0
mean=[0.485, 0.456, 0.406]
std=[0.229, 0.224, 0.225]

img[..., 0] -= mean[0]
img[..., 1] -= mean[1]
img[..., 2] -= mean[2]

img[..., 0] /= std[0]
img[..., 1] /= std[1]
img[..., 2] /= std[2]
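The per-channel assignments above can also be written as a single NumPy broadcast over the trailing channel axis, which is equivalent (sketch with random data standing in for the image):

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# stand-in for an (H, W, C) image already scaled to [0.0, 1.0]
img = np.random.default_rng(1).uniform(0, 1, size=(4, 4, 3))

# one broadcast over the last (channel) axis...
broadcast = (img - mean) / std

# ...matches the channel-by-channel version
loop = img.copy()
for c in range(3):
    loop[..., c] = (loop[..., c] - mean[c]) / std[c]

print(np.allclose(broadcast, loop))  # True
```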
