Same weights and implementation, but different results in Keras and PyTorch

Question

I have an encoder and a decoder model (monodepth2). I am trying to convert them from PyTorch to Keras using Onnx2Keras, but:

The encoder (ResNet-18) converts successfully.
I built the decoder myself in Keras (with TF 2.3) and copied the weights (numpy arrays, including weight and bias) for each layer from PyTorch to Keras, without any modification.

But it turns out that both the Onnx2Keras-converted encoder and the self-built decoder fail to reproduce the original results. The cross-comparison pictures are below, but first I'll introduce the code of the decoder.

First, the core layer. All the conv2d layers (Conv3x3, ConvBlock) are based on it, differing only in their dimensions or in whether an activation is added:

# Conv3x3 (plain conv2d with neither BN nor activation)
# There's also a ConvBlock, which is just "Conv3x3 + ELU activation", so I don't list it here.
def TF_Conv3x3(input_channel, filter_num, pad_mode='reflect', activate_type=None):

    # The padding is actually 'reflect', but it is applied with tf.pad() outside this layer
    padding = 'valid'

    # For TF_ConvBlock, activate_type == 'elu'
    conv = tf.keras.layers.Conv2D(filters=filter_num, kernel_size=3, activation=activate_type,
                                  strides=1, padding=padding)
    return conv

Then the structure. Note that the definition is EXACTLY the same as in the original code, so I think the difference must come from some detail of the implementation.

def DepthDecoder_keras(num_ch_enc=np.array([64, 64, 128, 256, 512]), channel_first=False,
                       scales=range(4), num_output_channels=1):
    num_ch_dec = np.array([16, 32, 64, 128, 256])
    convs = OrderedDict()
    for i in range(4, -1, -1):
        # upconv_0
        num_ch_in = num_ch_enc[-1] if i == 4 else num_ch_dec[i + 1]
        num_ch_out = num_ch_dec[i]

        # convs[("upconv", i, 0)] = ConvBlock(num_ch_in, num_ch_out)
        convs[("upconv", i, 0)] = TF_ConvBlock(num_ch_in, num_ch_out, pad_mode='reflect')


        # upconv_1
        num_ch_in = num_ch_dec[i]
        if i > 0:
            num_ch_in += num_ch_enc[i - 1]
        num_ch_out = num_ch_dec[i]
        convs[("upconv", i, 1)] = TF_ConvBlock(num_ch_in, num_ch_out, pad_mode='reflect')  # Just Conv3x3 with ELU-activation

    for s in scales:
        convs[("dispconv", s)] = TF_Conv3x3(num_ch_dec[s], num_output_channels, pad_mode='reflect')

    """
    Input_layer dims: (64, 96, 320), (64, 48, 160),  (128, 24, 80), (256, 12, 40), (512, 6, 20)
    """
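    x0 = tf.keras.layers.Input(shape=(96, 320, 64))
    # the remaining input layers (channels-last), following the dims listed above
    x1 = tf.keras.layers.Input(shape=(48, 160, 64))
    x2 = tf.keras.layers.Input(shape=(24, 80, 128))
    x3 = tf.keras.layers.Input(shape=(12, 40, 256))
    x4 = tf.keras.layers.Input(shape=(6, 20, 512))
    input_features = [x0, x1, x2, x3, x4]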

    """
    # connect layers
    """
    outputs = []
    ch = 1 if channel_first else 3
    x = input_features[-1]
    for i in range(4, -1, -1):
        x = tf.pad(x, paddings=[[0, 0], [1, 1], [1, 1], [0, 0]], mode='REFLECT')
        x = convs[("upconv", i, 0)](x)
        x = [tf.keras.layers.UpSampling2D()(x)]
        if i > 0:
            x += [input_features[i - 1]]
        x = tf.concat(x, ch)
        x = tf.pad(x, paddings=[[0, 0], [1, 1], [1, 1], [0, 0]], mode='REFLECT')
        x = convs[("upconv", i, 1)](x)
    x = TF_ReflectPad2D_1()(x)
    x = convs[("dispconv", 0)](x)
    disp0 = tf.math.sigmoid(x)

    """
    build keras Model ([input0, ...], [output0, ...])
    """
    # decoder = tf.keras.Model(input_features, outputs)
    decoder = tf.keras.Model(input_features, disp0)

    return decoder

The cross-comparison is as follows. I would really appreciate it if anyone could offer some insights. Thanks!

Original results:

Original encoder + self-built decoder:

ONNX-converted encoder + original decoder (the texture is good, but the contrast is too low; the car should be very close, i.e. a very bright color):

ONNX-converted encoder + self-built decoder:

Answer 1

Solved!

It turns out there was indeed no problem with the implementation (at least no significant one). The problem was in how the weights were copied.

The original PyTorch conv weights have shape (H, W, 3, 3), where H and W here stand for the output and input channel counts, but the TF model requires (3, 3, W, H). I permuted them with [3,2,1,0], overlooking that the two (3, 3) kernel axes have their own order as well, so every kernel ended up transposed.

So it should be weights.permute([2,3,1,0]), and all is well!
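
To illustrate the difference between the two permutations, here is a minimal NumPy sketch. It assumes the usual layouts, (out_channels, in_channels, kH, kW) for a PyTorch conv weight and (kH, kW, in_channels, out_channels) for a Keras Conv2D kernel. The helper name torch_conv_to_keras and the toy tensor are made up for this example and are not part of monodepth2 or Onnx2Keras.

```python
import numpy as np

def torch_conv_to_keras(w):
    """Reorder a conv weight from (out, in, kH, kW) to (kH, kW, in, out).

    Hypothetical helper for illustration; the layouts are assumptions
    stated in the text above.
    """
    return np.transpose(w, (2, 3, 1, 0))

# Toy weight with distinct values so that any reordering is visible.
out_ch, in_ch = 2, 3
w = np.arange(out_ch * in_ch * 3 * 3, dtype=np.float32).reshape(out_ch, in_ch, 3, 3)

wrong = np.transpose(w, (3, 2, 1, 0))  # the permutation used at first
right = torch_conv_to_keras(w)         # the permutation from the fix

# Both results have the shape Keras expects, so no shape error is raised.
same_shape = wrong.shape == right.shape

# The values differ: the first permutation swaps the kernel's row and column axes.
kernels_match = np.array_equal(wrong, right)
is_spatial_transpose = np.array_equal(wrong, np.transpose(right, (1, 0, 2, 3)))

# With the correct permutation, each 3x3 kernel keeps its original orientation.
first_kernel_preserved = np.array_equal(right[:, :, 0, 0], w[0, 0])
```

Because both permutations produce the same shape when the kernel is square, Keras accepts either one without complaint, which is probably why the mistake only shows up in the output images.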
