JPEG compression differences between C# and Python

Question

I am moving some image processing functionality from .NET to Python under the constraint that the output images must be compressed in exactly the same way as they were in .NET. However, when I compare the .jpg output files with a tool like text-compare and choose "Ignore nothing", there are significant differences in how the files were compressed.

For example:

Python

import PIL.Image

bmp = PIL.Image.open('marbles.bmp')

bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300,300),
    subsampling=2,
    quality=75
)

.NET

ImageCodecInfo jgpEncoder = ImageCodecInfo.GetImageDecoders().First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);
EncoderParameters myEncoderParameters = new EncoderParameters(1);
myEncoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, 75L);

Bitmap bmp = new Bitmap(directory + "marbles.bmp");

bmp.Save(directory + "output_net.jpg", jgpEncoder, myEncoderParameters);

exiftool output_python.jpg -a -G1 -w txt

[ExifTool]      ExifTool Version Number         : 12.31
[System]        File Name                       : output_python.jpg
[System]        Directory                       : .
[System]        File Size                       : 148 KiB
[System]        File Modification Date/Time     : 2021:09:28 09:19:20-06:00
[System]        File Access Date/Time           : 2021:09:28 09:19:21-06:00
[System]        File Creation Date/Time         : 2021:09:27 21:33:35-06:00
[System]        File Permissions                : -rw-rw-rw-
[File]          File Type                       : JPEG
[File]          File Type Extension             : jpg
[File]          MIME Type                       : image/jpeg
[File]          Image Width                     : 1419
[File]          Image Height                    : 1001
[File]          Encoding Process                : Baseline DCT, Huffman coding
[File]          Bits Per Sample                 : 8
[File]          Color Components                : 3
[File]          Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
[JFIF]          JFIF Version                    : 1.01
[JFIF]          Resolution Unit                 : inches
[JFIF]          X Resolution                    : 300
[JFIF]          Y Resolution                    : 300
[Composite]     Image Size                      : 1419x1001
[Composite]     Megapixels                      : 1.4

exiftool output_net.jpg -a -G1 -w txt

[ExifTool]      ExifTool Version Number         : 12.31
[System]        File Name                       : output_net.jpg
[System]        Directory                       : .
[System]        File Size                       : 147 KiB
[System]        File Modification Date/Time     : 2021:09:28 09:18:05-06:00
[System]        File Access Date/Time           : 2021:09:28 09:18:52-06:00
[System]        File Creation Date/Time         : 2021:09:27 21:32:19-06:00
[System]        File Permissions                : -rw-rw-rw-
[File]          File Type                       : JPEG
[File]          File Type Extension             : jpg
[File]          MIME Type                       : image/jpeg
[File]          Image Width                     : 1419
[File]          Image Height                    : 1001
[File]          Encoding Process                : Baseline DCT, Huffman coding
[File]          Bits Per Sample                 : 8
[File]          Color Components                : 3
[File]          Y Cb Cr Sub Sampling            : YCbCr4:2:0 (2 2)
[JFIF]          JFIF Version                    : 1.01
[JFIF]          Resolution Unit                 : inches
[JFIF]          X Resolution                    : 300
[JFIF]          Y Resolution                    : 300
[Composite]     Image Size                      : 1419x1001
[Composite]     Megapixels                      : 1.4

marbles.bmp sample image

Difference on text-compare

Questions

Is it reasonable to assume that these two implementations of JPEG compression could yield identical output files?
If so, is either PIL or System.Drawing.Image doing any extra steps, such as anti-aliasing, that make the results different?
Or are there additional parameters to PIL's .save() that make it behave more like the JPEG encoder in C#?

Thanks

Update

Based on Jeremy's recommendation, I used JPEGsnoop to compare more details between the files and found that the luminance and chrominance quantization tables were different. I modified the code:

import PIL.Image

bmp = PIL.Image.open('marbles.bmp')

output_net = PIL.Image.open('output_net.jpg')

bmp.save(
    'output_python.jpg',
    format='jpeg',
    dpi=(300,300),
    subsampling=2,
    qtables=output_net.quantization,
    #quality=75
)

Now the tables are the same, but the difference between the files is unchanged. The only differences JPEGsnoop shows now are in the "Compression stats" and "Huffman code histogram stats" sections.

output_net.jpg

*** Decoding SCAN Data ***
  OFFSET: 0x0000026F
  Scan Decode Mode: Full IDCT (AC + DC)

  Scan Data encountered marker   0xFFD9 @ 0x00024BE7.0

  Compression stats:
    Compression Ratio: 28.43:1
    Bits per pixel:     0.84:1

  Huffman code histogram stats:
    Huffman Table: (Dest ID: 0, Class: DC)
      # codes of length 01 bits:        0 (  0%)
      # codes of length 02 bits:     1664 (  7%)
      # codes of length 03 bits:    18238 ( 81%)
      # codes of length 04 bits:     1807 (  8%)
      # codes of length 05 bits:      715 (  3%)
      # codes of length 06 bits:        4 (  0%)
      # codes of length 07 bits:        0 (  0%)
      ...

output_python.jpg

*** Decoding SCAN Data ***
  OFFSET: 0x0000026F
  Scan Decode Mode: Full IDCT (AC + DC)

  Scan Data encountered marker   0xFFD9 @ 0x00025158.0

  Compression stats:
    Compression Ratio: 28.17:1
    Bits per pixel:     0.85:1

  Huffman code histogram stats:
    Huffman Table: (Dest ID: 0, Class: DC)
      # codes of length 01 bits:        0 (  0%)
      # codes of length 02 bits:     1659 (  7%)
      # codes of length 03 bits:    18247 ( 81%)
      # codes of length 04 bits:     1807 (  8%)
      # codes of length 05 bits:      711 (  3%)
      # codes of length 06 bits:        4 (  0%)
      # codes of length 07 bits:        0 (  0%)
      ...

I am now looking for a way to sync these values through PIL.
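One way to narrow down where the two files diverge is to split each file into its marker segments and compare them one by one. The following is a minimal sketch using only the standard library, not a full JPEG parser: it assumes a baseline, single-scan file with no fill bytes before markers, and the names `jpeg_segments` and `diff_segments` are hypothetical helpers introduced here for illustration.

```python
import struct

# Names for the markers that typically appear in a baseline JFIF file.
MARKER_NAMES = {
    0xD8: "SOI", 0xE0: "APP0", 0xDB: "DQT", 0xC0: "SOF0",
    0xC4: "DHT", 0xDA: "SOS", 0xD9: "EOI",
}


def jpeg_segments(data):
    """Return (name, payload) pairs for each segment of a baseline JPEG."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing SOI marker")
    segments = [("SOI", b"")]
    pos = 2
    while pos < len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        name = MARKER_NAMES.get(marker, f"0x{marker:02X}")
        if marker == 0xD9:  # EOI carries no length field
            segments.append((name, b""))
            break
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        segments.append((name, data[pos + 4:pos + 2 + length]))
        pos += 2 + length
        if marker == 0xDA:
            # Entropy-coded data follows the SOS header. It ends at the next
            # 0xFF that is not a stuffed byte (FF00), fill (FFFF) or RSTn.
            end = pos
            while end < len(data) - 1:
                nxt = data[end + 1]
                if data[end] == 0xFF and nxt not in (0x00, 0xFF) and not 0xD0 <= nxt <= 0xD7:
                    break
                end += 1
            else:
                end = len(data)
            segments.append(("SCAN", data[pos:end]))
            pos = end
    return segments


def diff_segments(a, b):
    """List (index, name_a, name_b) for segments that differ between a and b."""
    pairs = zip(jpeg_segments(a), jpeg_segments(b))
    return [
        (i, seg_a[0], seg_b[0])
        for i, (seg_a, seg_b) in enumerate(pairs)
        if seg_a != seg_b
    ]
```

For the files above it could be called as `diff_segments(open('output_net.jpg', 'rb').read(), open('output_python.jpg', 'rb').read())`. If only the SCAN entry (and possibly DHT) is reported, the headers and quantization tables already match and the remaining difference is in the encoded coefficient data itself.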

Answer 1

Is it reasonable to assume that these two implementations of JPEG compression could yield identical output files?

Not really.

The point of JPEG compression is a high compression ratio at the cost of some loss. Even at a quality setting of 100, loss is inevitable, because the algorithm would need infinite precision to replicate the source image exactly.

It is possible to produce identical files if both algorithms are coded identically and use the same parameters: arithmetic precision, boundary selection, and the padding/offset rules used to fill out partial blocks for the block-based DCT that JPEG relies on.

Implementations of the JPEG algorithm may use pre-passes to optimize the parameters of the algorithm.

Given that the optimization of these parameters differs between the two implementations, it is unlikely that the outputs would be identical.
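As an illustration of such a pre-pass, Pillow's `save()` accepts an `optimize` option, which its documentation describes as an extra pass over the image to select optimal encoder settings. The sketch below encodes the same image with and without it. The synthetic gradient is only a stand-in for marbles.bmp, and `encode` is a hypothetical helper. It says nothing about what the .NET encoder does internally; it only shows that one encoder can emit different bytes for the same pixels and the same quantization settings.

```python
import io

from PIL import Image, ImageChops


def encode(img, **options):
    """Encode img as JPEG in memory and return the raw bytes."""
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=75, subsampling=2, **options)
    return buf.getvalue()


# Synthetic 64x64 gradient used in place of marbles.bmp (assumption).
img = Image.new("RGB", (64, 64))
img.putdata([(x * 4, y * 4, (x + y) * 2) for y in range(64) for x in range(64)])

plain = encode(img)                     # default Huffman tables
optimized = encode(img, optimize=True)  # extra pass builds custom tables

# Byte-level comparison of the two encoded files.
same_bytes = plain == optimized

# Pixel-level comparison after decoding both files with the same decoder.
decoded_plain = Image.open(io.BytesIO(plain))
decoded_optimized = Image.open(io.BytesIO(optimized))
same_pixels = ImageChops.difference(decoded_plain, decoded_optimized).getbbox() is None

print("identical bytes: ", same_bytes)
print("identical pixels:", same_pixels)
```

The two files decode to the same pixels because only the entropy coding changes, yet a byte-for-byte comparison reports them as different. Two independent encoders can additionally differ in the coefficient data itself, which is what the DC histogram counts in the question suggest.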

Are there additional parameters to PIL's .save() to make it behave more like the JPEG encoder in C#?

I cannot answer this question directly, but you can use the Python for .NET package to access the C# JPEG encoder from Python. Because the same encoder does the work, this should give results identical to the existing .NET output.

Why would anyone need binary compatibility, other than for the educational value?

In every practical scenario I can think of for this question, the only real need is to save an additional hash of the image: store the new hash in a separate field.
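A minimal sketch of that idea, assuming the application keeps one record (here a plain dict) per image. The field names `net_sha256` and `python_sha256` and both helper functions are hypothetical and only illustrate keeping the old and new hashes side by side.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def add_new_hash(record, new_path):
    """Return a copy of record with the re-encoded file's hash in its own field."""
    updated = dict(record)
    updated["python_sha256"] = sha256_of_file(new_path)  # hypothetical field name
    return updated
```

With this layout, a lookup can match a file against either hash, so images written by the old .NET code and by the new Python code both remain identifiable.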

Pick a technology and use it until it no longer fits your needs or requirements. When it doesn't (preferably well before), find shims to fill the gap and rewrite the code to use the new technology.

Answer 2

I do not believe JPEG encoding is deterministic across implementations, so I would expect different implementations to produce different binaries. I don't have a reference to support that assertion. In fact, I would not expect .NET to be entirely consistent across the lifetime of the API: in my opinion, .NET 1.1 on Windows 98 is unlikely to produce the same output as .NET 4.8 on Windows 11 until tested and proven otherwise. You should confirm that the oldest images produced at the beginning of the application's lifecycle still convert identically today.

[Edit: I see Strom mentioned Python.NET. I will still include my code here, but I recommend not rolling your own.]

Instead, I would approach this by having the Python code call the .NET function. Untested:

jpe.net.cs

using [...]

class JPEGNET
{
    [DllExport("save", CallingConvention = CallingConvention.Cdecl)]
    public static int save()
    {
        ImageCodecInfo jgpEncoder = ImageCodecInfo.GetImageDecoders().First(codec => codec.FormatID == ImageFormat.Jpeg.Guid);
        EncoderParameters myEncoderParameters = new EncoderParameters(1);
        myEncoderParameters.Param[0] = new EncoderParameter(Encoder.Quality, 75L);

        Bitmap bmp = new Bitmap(directory + "marbles.bmp");

        bmp.Save(directory + "output_net.jpg", jgpEncoder, myEncoderParameters);

        // The method is declared to return int; 0 signals success to the caller.
        return 0;
    }
}

jpe.net.py

import ctypes
# `source` is a placeholder for the path to the compiled DLL.
jpegnet = ctypes.cdll.LoadLibrary(source)
jpegnet.save()

Answer 1
2 2021-10-04 01:18:40

Answer 2
0 2021-10-06 22:07:11
