
How to convert a uint32 NumPy array into 4 uint8 NumPy arrays taking care of endianness

For context, I'm writing a Python program that writes out images. These images are a bit special in that they serve as intermediate data containers, which are further processed by other programs written in C that use the libgd library. I have no clue about C.

My precise problem is that I have a NumPy array of dtype='uint32'. I want to decode this array into 4 arrays of dtype='uint8' and then use them to write out an image. This can be done with numpy.ndarray.view:

img_decoded = img_coded[:, :, np.newaxis].view('uint8')

Now, img_decoded has shape (dimY, dimX, 4). My doubt is which index of the third dimension corresponds to which channel. The C programs I'm interacting with expect the most significant byte to be written to the Alpha channel, then Red, then Green and finally Blue. How can I make sure that this correspondence holds? I'm aware this has something to do with endianness, but the concept is still fuzzy to me.


Related to all this, I have been experimenting to gain insight into these concepts, but commands like this still blow my mind:

In []: np.array([256 * 4 + 1], dtype='uint16').view(dtype='uint8')
Out[]: array([1, 4], dtype=uint8)

What does this tell me about the order of the most significant bit? Why is the output [1, 4] and not the other way around? What does this have to do with endianness?

The C programs I'm interacting with expect the most significant byte to be written to the Alpha channel, then Red, then Green and finally Blue. How can I make sure that this correspondence holds?

This depends heavily on both the pixel encoding method and the target platform.

Regarding the encoding, some libraries use the BGRA format while others use RGBA, for example. Many support multiple formats, but one needs to be selected at a time.

On conventional/mainstream platforms, a uint32 value consists of 4 x 8 bits and is stored in 4 consecutive 8-bit bytes of memory. Depending on the platform, the 8 most significant bits are stored either in the byte with the lowest memory address or in the byte with the highest one. This is what is called endianness. Some platforms have unusual endianness (such as middle-endian) or support multiple endiannesses, which in some cases leads to runtime-defined endianness (AFAIK, ARM and POWER support that, for example, although the "default" should be little-endian nowadays). Endianness issues arise only with native types (or low-level unions) that span multiple bytes.
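To make the two byte orders concrete, here is a minimal sketch using only Python's built-in int.to_bytes. The value 0x11223344 is an arbitrary illustration, chosen so that each byte is easy to recognise.

```python
# Arbitrary 32-bit example value: most significant byte 0x11, least 0x44.
value = 0x11223344

# Big-endian: the most significant byte comes first (lowest address).
big = value.to_bytes(4, byteorder="big")

# Little-endian: the least significant byte comes first (lowest address).
little = value.to_bytes(4, byteorder="little")

print(list(big))     # [17, 34, 51, 68]  i.e. 0x11, 0x22, 0x33, 0x44
print(list(little))  # [68, 51, 34, 17]  i.e. 0x44, 0x33, 0x22, 0x11
```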

You can check the endianness at runtime with the example code you provided (although using a uint32-typed variable is safer). From the result (i.e. [1, 4] or [4, 1]) you can infer the endianness. Based on the endianness, you can use an if-else statement to encode, decode or even directly compute the pixels (you can put that in a generic encoding/decoding function).
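The runtime check combined with the if-else could look roughly like the sketch below. The function names native_is_little_endian and split_argb_with_view are hypothetical, and the code assumes img_coded is a 2D array of native-byte-order uint32 values laid out as 0xAARRGGBB.

```python
import sys

import numpy as np


def native_is_little_endian():
    """Probe the platform: a uint32 holding 1 starts with byte 1 only on little-endian."""
    probe = np.array([1], dtype=np.uint32).view(np.uint8)
    return bool(probe[0] == 1)


def split_argb_with_view(img_coded):
    """Return a (dimY, dimX, 4) uint8 array ordered alpha, red, green, blue.

    Assumes a 2D array of native uint32 values encoded as 0xAARRGGBB.
    """
    img_coded = np.ascontiguousarray(img_coded, dtype=np.uint32)
    channels = img_coded.view(np.uint8).reshape(img_coded.shape + (4,))
    if native_is_little_endian():
        # Memory order is B, G, R, A on little-endian, so reverse the last axis.
        channels = channels[:, :, ::-1]
    return channels


# The probe should agree with what the interpreter itself reports.
assert native_is_little_endian() == (sys.byteorder == "little")

img = np.array([[0x11223344]], dtype=np.uint32)
print(split_argb_with_view(img)[0, 0])  # [17 34 51 68]
```

Index 0 of the last axis is then always the most significant byte (alpha), whatever the platform.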

An alternative solution is to not use views at all and instead use portable bit-wise operations (independent of the endianness of the target platform).

Here is an example:

alpha = img_coded >> 24
red = (img_coded >> 16) & 0xFF
green = (img_coded >> 8) & 0xFF
blue = img_coded & 0xFF
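For completeness, a self-contained version of the same idea might look like this. The helper name split_argb is hypothetical, and the sketch assumes the pixels are packed as 0xAARRGGBB. The shifts results are still uint32, so each channel is cast to uint8 explicitly.

```python
import numpy as np


def split_argb(img_coded):
    """Split packed 0xAARRGGBB pixels into four uint8 arrays (alpha, red, green, blue).

    Shifts and masks operate on values, not on memory layout, so the result
    does not depend on the endianness of the platform.
    """
    img_coded = np.asarray(img_coded, dtype=np.uint32)
    alpha = (img_coded >> 24).astype(np.uint8)
    red = ((img_coded >> 16) & 0xFF).astype(np.uint8)
    green = ((img_coded >> 8) & 0xFF).astype(np.uint8)
    blue = (img_coded & 0xFF).astype(np.uint8)
    return alpha, red, green, blue


img = np.array([[0x80FF0001, 0x01020304]], dtype=np.uint32)
alpha, red, green, blue = split_argb(img)
print(alpha)  # [[128   1]]
print(blue)   # [[1 4]]
```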

What does this tell me about the order of the most significant bit? Why is the output [1, 4] and not the other way around? What does this have to do with endianness?

This means your platform uses the little-endian format. This is what mainstream x86-64 platforms use. The little-endian format stores the least significant byte first (1 here). The same code on a big-endian platform should result in [4, 1].
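You do not need a big-endian machine to see both results. As a small sketch, NumPy lets you spell the byte order out in the dtype string ('<' for little-endian, '>' for big-endian), which reproduces both outputs on any platform:

```python
import numpy as np

value = 256 * 4 + 1  # 0x0401: high byte 4, low byte 1

# Explicit little-endian uint16: low byte first in memory.
little = np.array([value], dtype="<u2").view(np.uint8)

# Explicit big-endian uint16: high byte first in memory.
big = np.array([value], dtype=">u2").view(np.uint8)

print(little)  # [1 4]
print(big)     # [4 1]
```

The same trick applies to the original problem: converting the image with astype('>u4') before taking the uint8 view should put the most significant byte at index 0 regardless of the native byte order.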


 