Different results after converting a model from PyTorch to ONNX

Question

I am converting the googlenet model from PyTorch to ONNX with the following code:

torch.onnx.export(model,               # model being run
                  input_batch,                         # model input (or a tuple for multiple inputs)
                  "google-net-onnx-test.onnx",   # where to save the model (can be a file or file-like object)
                  export_params=True,        # store the trained parameter weights inside the model file
                  opset_version=10,          # the ONNX version to export the model to
                  do_constant_folding=True,  # whether to execute constant folding for optimization
                  input_names = ['input'],   # the model's input names
                  output_names = ['output'], # the model's output names
                  dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes
                                'output' : {0 : 'batch_size'}})
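One way to check whether the export itself is at fault is to feed the identical tensor to both runtimes and compare the raw outputs, which takes image preprocessing out of the picture. The sketch below is a minimal NumPy helper for that comparison. `compare_outputs` and its tolerances are illustrative names and values, not part of PyTorch or ONNX Runtime, and the commented usage assumes the `model`, `input_batch` and file name from the export call above.

```python
import numpy as np

def compare_outputs(reference, candidate, rtol=1e-3, atol=1e-5):
    """Return (max_abs_diff, allclose, same_top1) for two output arrays of shape (N, C)."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    if reference.shape != candidate.shape:
        raise ValueError(f"shape mismatch: {reference.shape} vs {candidate.shape}")
    # Largest element-wise deviation between the two outputs
    max_abs_diff = float(np.max(np.abs(reference - candidate)))
    # Tolerance-based equality (tolerances are assumptions, tune as needed)
    close = bool(np.allclose(reference, candidate, rtol=rtol, atol=atol))
    # Whether both outputs pick the same class for every item in the batch
    same_top1 = bool(np.array_equal(reference.argmax(axis=-1), candidate.argmax(axis=-1)))
    return max_abs_diff, close, same_top1

# Hypothetical usage with the objects from the question (not executed here):
#   import onnxruntime as ort
#   session = ort.InferenceSession("google-net-onnx-test.onnx")
#   torch_out = model(input_batch).detach().cpu().numpy()
#   onnx_out = session.run(None, {"input": input_batch.cpu().numpy()})[0]
#   print(compare_outputs(torch_out, onnx_out))
```

If the two outputs agree on the same tensor, the discrepancy most likely comes from the pre- or post-processing rather than from the conversion.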

When I run the model in PyTorch on this image:

I get the correct result:

Samoyed 0.9378381967544556
Pomeranian 0.00828344002366066
Great Pyrenees 0.005603068508207798
Arctic fox 0.005527767818421125
white wolf 0.004741032607853413

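But when I do the same with ONNX, I get this:

The pre- and post-processing code differs between the two cases, but I think they should be equivalent.

Here is the complete PyTorch code: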

import torch
from PIL import Image
from torchvision import transforms

model = torch.hub.load('pytorch/vision:v0.10.0', 'googlenet', pretrained=True)
model.eval()


input_image = Image.open(filename)
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0) # create a mini-batch as expected by the model

# move the input and model to GPU for speed if available
if torch.cuda.is_available():
    input_batch = input_batch.to('cuda')
    model.to('cuda')



with torch.no_grad():
    output = model(input_batch)
# Tensor of shape 1000, with confidence scores over Imagenet's 1000 classes
#print(output[0])
# The output has unnormalized scores. To get probabilities, you can run a softmax on it.
probabilities = torch.nn.functional.softmax(output[0], dim=0)
print(probabilities[:2])

# Read the categories
with open("imagenet_classes.txt", "r") as f:
    categories = [s.strip() for s in f.readlines()]
# Show top categories per image
top5_prob, top5_catid = torch.topk(probabilities, 5)
for i in range(top5_prob.size(0)):
    print(categories[top5_catid[i]], top5_prob[i].item())

Here is the ONNX code:

from PIL import Image
import imageio
import onnxruntime as ort
import numpy as np

import matplotlib.pyplot as plt
import numpy as np
from collections import namedtuple
import os
import time


def get_image(path):
    '''
        Using path to image, return the RGB load image
    '''
    img = imageio.imread(path, pilmode='RGB')
    return img

# Pre-processing function for ImageNet models using numpy
def preprocess(img):
    '''
    Preprocessing required on the images for inference with mxnet gluon
    The function takes loaded image and returns processed tensor
    '''
    img = np.array(Image.fromarray(img).resize((224, 224))).astype(np.float32)
    img[:, :, 0] -= 123.68
    img[:, :, 1] -= 116.779
    img[:, :, 2] -= 103.939
    img[:,:,[0,1,2]] = img[:,:,[2,1,0]]
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis=0)

    return img


def predict(path):
    img_batch = preprocess(get_image(path))

    outputs = ort_session.run(
        None,
        {"input": img_batch.astype(np.float32)},
    )

    a = np.argsort(-outputs[0].flatten())
    results = {}
    for i in a[0:5]:
        results[labels[i]]=float(outputs[0][0][i])
    return results

ort_session = ort.InferenceSession("/content/google-net-onnx-test.onnx")

with open('synset.txt', 'r') as f:
    labels = [l.rstrip() for l in f]

image_path = "/content/dog.jpg"
predict(image_path)

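I took the PyTorch code from this tutorial

and the ONNX code from the ONNX Zoo on GitHub.

Edit:

Following @jhso's comment, I looked at the normalization step:

mean=[0.485, 0.456, 0.406]

which seems to me to be equivalent to: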

img[:, :, 0] -= 123.68
img[:, :, 1] -= 116.779
img[:, :, 2] -= 103.939

because:

constant = 256
a,b,c =  123.68/constant, 116.779/constant, 103.939/constant

print (f'{a:.3f} {b:.3f} {c:.3f}')
0.483 0.456 0.406

As for the std part, I am not sure whether it happens at all, or whether it is equivalent to:

img[:,:,[0,1,2]] = img[:,:,[2,1,0]]
img = img.transpose((2, 0, 1))

I ran the code again today and got closer results:

Answer 1

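Your preprocessing is wrong. Note that the PyTorch pipeline has a center crop (less important) and a standard-deviation normalization step that your ONNX code does not apply. You also appear to be converting from BGR, which isn't needed when using PIL (it's more of an OpenCV thing); I'm going from memory here, so happy to be corrected.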

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
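To get a feel for how far apart the two pipelines are, it can help to push a single pixel through both. The sketch below assumes a mid-gray RGB pixel (128, 128, 128), chosen only for illustration. It applies the torchvision-style steps (scale to [0, 1], subtract the mean, divide by the std) and then the steps from the question's NumPy code (subtract a 0-255 mean, reverse the channel order).

```python
import numpy as np

pixel = np.array([128.0, 128.0, 128.0])  # hypothetical RGB pixel on the 0-255 scale

# torchvision-style: ToTensor scales to [0, 1], Normalize subtracts mean and divides by std
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])
torch_style = (pixel / 255.0 - mean) / std

# Question's version: subtract per-channel means on the 0-255 scale,
# then reverse the channel order (RGB -> BGR); no division by std
question_style = (pixel - np.array([123.68, 116.779, 103.939]))[::-1]

print("torchvision-style:", np.round(torch_style, 3))
print("question's code:  ", np.round(question_style, 3))
```

The two results differ in channel order and, roughly, by a factor on the order of 255 × std (about 57) in scale, so the exported network receives inputs well outside the range it was trained on.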

Your preprocessing stage should look something like this (ymmv):

# Pre-processing function for ImageNet models using numpy
def preprocess(img):
    '''
    Approximate the torchvision pipeline above with PIL and NumPy.
    Takes an RGB image array (H, W, 3) and returns a float32 batch (1, 3, 224, 224).
    '''
    # resize to 256x256 (note: transforms.Resize(256) keeps the aspect ratio; this does not)
    img = np.array(Image.fromarray(img).resize((256, 256))).astype(np.float32)
    # center crop to 224x224
    rm_pad = (256 - 224) // 2
    img = img[rm_pad:-rm_pad, rm_pad:-rm_pad]
    # scale to 0-1, as transforms.ToTensor does
    img /= 255.
    # normalize by mean and std (RGB order)
    img = (img - np.array([0.485, 0.456, 0.406])) / np.array([0.229, 0.224, 0.225])
    # img[:,:,[0,1,2]] = img[:,:,[2,1,0]]  # RGB -> BGR swap; don't think this is needed?
    # HWC -> CHW, then add the batch dimension
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis=0)

    return img.astype(np.float32)
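The post-processing differs as well: the PyTorch script applies a softmax before printing, while the `predict` function in the question reports the raw outputs, so the numbers would not match even with identical inputs. A minimal sketch of the matching post-processing in NumPy follows. `softmax` and `top_k` are helper names introduced here for illustration, and the commented usage assumes the `ort_session`, `img_batch` and `labels` objects from the question.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    logits = np.asarray(logits, dtype=np.float64)
    shifted = logits - logits.max(axis=-1, keepdims=True)  # avoid overflow in exp
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def top_k(probabilities, labels, k=5):
    """Return the k most probable (label, probability) pairs for one image."""
    order = np.argsort(-probabilities)[:k]
    return [(labels[i], float(probabilities[i])) for i in order]

# Hypothetical usage with the objects from the question (not executed here):
#   logits = ort_session.run(None, {"input": img_batch.astype(np.float32)})[0][0]
#   for name, p in top_k(softmax(logits), labels):
#       print(name, p)
```

With both changes in place, the ONNX probabilities can be compared directly against the `torch.topk` output of the PyTorch script.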

Answer 1 (accepted, 2022-06-21 23:01:04)
