
Bit manipulation of integers depending on endianness

My question concerns bit manipulation when the endianness changes. In particular, I have some code that reads individual bits of a uint32_t value and performs bit manipulation on them. The purpose is UTF-8 encoding. It works perfectly on my little-endian machine.

Revisiting the code recently, it dawned on me that I was not considering the endianness of the machine as far as the uint32_t value's bit representation is concerned. So I have some questions in that respect.

Let's assume some example code that only needs bits 7-10 of a uint32_t, saved into a separate byte.

uint32_t v;
v = 18341;
char c = (v &(uint32_t) 0x3C0)>>6;

For little endian, the number 18341 is represented as 0x47A5, or in binary:

0100 0111 1010 0101

and the above code should give us 1110 stored in the char.

Now the question is how we would achieve this on a big-endian machine. The same number would be represented quite differently, as 0xA5470000, or in binary:

1010 0101 0100 0111 0000 0000 0000 0000

with the bits we seek in totally different positions and not even consecutive.

Instead of using 0x3C0 on the other side of the & we would have to use something else, since the byte order is different. And especially since we need consecutive bits of a byte, we would require multiple bitwise & operations like below, right?

char c = ((v&(uint32_t)0xc0)>>6) | ((v&(uint32_t)0x300)>>6);

Summing up: is my understanding correct that, in cases where we need sequential bits of an integer value represented in binary, we would need to perform different manipulations for the two endianness cases?

Finally, is there a better way to achieve the same thing than the one I showed above? Maybe I am missing something totally obvious.

No. If you are using values (like 0x300) and language operators (<<, |, &), endianness does not matter, because these operators act on the value of the integer and not on the order of its bytes in memory; the compiler maps them onto the machine's own representation. So in your case you do not need to worry about this problem. You do need to worry, for example, when you are copying bytes from a file into memory.
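To make the point above concrete, here is a minimal sketch in C. It reuses the mask 0x3C0 and the value 18341 from the question; the function names extract_bits_6_to_9 and host_is_little_endian are hypothetical and chosen only for illustration. The first function should return the same result on any host, while the second shows the one place where byte order becomes visible: looking at the bytes in memory.

```c
#include <stdint.h>
#include <string.h>

/* Extract bits 6..9 (0-based) of v using the mask from the question.
   Shifts and masks operate on the numeric value, so the result does
   not depend on whether the host is little- or big-endian. */
static unsigned extract_bits_6_to_9(uint32_t v)
{
    return (unsigned)((v & UINT32_C(0x3C0)) >> 6);
}

/* Hypothetical helper: returns 1 if the host stores the least
   significant byte first. It inspects the first byte in memory,
   which is where byte order actually shows up. */
static int host_is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* copy the lowest-addressed byte */
    return first == 1;
}
```

Calling extract_bits_6_to_9(18341) gives 0xE (binary 1110) regardless of what host_is_little_endian() reports.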

If you are dealing with the memory representation directly, you can convert the representation before manipulation:

#if defined (BENDIAN)
   val = makelittle(val);
#endif
   manip_lendian(val);
#if defined (BENDIAN)
   val = makebig(val);
#endif
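As an alternative to the conditional compilation above (this is a different technique, not the one in the snippet), the bytes can be assembled and split with shifts, which works the same on either kind of host. The sketch below assumes the external data is stored least significant byte first; load_le32 and store_le32 are hypothetical names introduced here for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: build a 32-bit value from four bytes that are
   stored least significant byte first (little-endian order). */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: write v into four bytes, least significant
   byte first, independent of the host's own byte order. */
static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}
```

With these, a buffer holding the bytes A5 47 00 00 loads as 18341 on any machine, and ordinary masks such as 0x3C0 can then be applied to the loaded value.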

Also see this answer.
