unsigned char array to unsigned int back to unsigned char array via memcpy is reversed

This isn't cross-platform code: everything runs on the same platform, so the endianness is the same throughout (little-endian).

I have this code:


    unsigned char array[4] = {'t', 'e', 's', 't'};

    unsigned int out = ((array[0] << 24) | (array[1] << 16) | (array[2] << 8) | (array[3]));
    std::cout << out << std::endl;

    unsigned char buff[4];
    memcpy(buff, &out, sizeof(unsigned int));
    std::cout << buff << std::endl;

I'd expect the output of buff to be "test" (with a trailing garbage character because of the missing '\0'), but instead the output is "tset". Obviously, changing the order of the characters I'm shifting (3, 2, 1, 0 instead of 0, 1, 2, 3) fixes the problem, but I don't understand why it happens. Is memcpy not acting the way I expect?

Thanks.

This is because your CPU is little-endian. In memory, the array is stored as (byte values in hex):

      +----+----+----+----+
array | 74 | 65 | 73 | 74 |
      +----+----+----+----+

Byte addresses increase to the right in this diagram. The integer, however, is stored in memory with its least significant byte on the left, at the lowest address:

    +----+----+----+----+
out | 74 | 73 | 65 | 74 |
    +----+----+----+----+

This is how the integer 0x74657374 is laid out in memory. Using memcpy() to copy that into buff gives you the bytes of your original array in reverse order.
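To see the two layouts side by side, here is a minimal sketch. The helper names `pack_msb_first` and `bytes_in_memory` are made up for illustration, and it assumes a 32-bit value as in the question. The first function builds the integer with the same shifts as the question; the second copies the integer's bytes out with memcpy, lowest address first.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: pack four bytes so that b[0] ends up in the most
// significant byte, exactly as the shifts in the question do.
std::uint32_t pack_msb_first(const unsigned char* b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8)  |
            static_cast<std::uint32_t>(b[3]);
}

// Hypothetical helper: copy the object representation of `value` out,
// lowest address first. The order of the result depends on the host's
// byte order; memcpy itself never reorders anything.
std::array<unsigned char, 4> bytes_in_memory(std::uint32_t value) {
    std::array<unsigned char, 4> bytes{};
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}
```

Calling `bytes_in_memory(pack_msb_first(array))` on a little-endian host should give the bytes 74 73 65 74, i.e. the second diagram, while a big-endian host should give 74 65 73 74.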

You're running this on a little-endian platform.

On a little-endian platform, a 32-bit int is stored in memory with the least significant byte at the lowest memory address. So bits 0-7 are stored at address P, bits 8-15 at address P + 1, bits 16-23 at address P + 2 and bits 24-31 at address P + 3.

In your example: bits 0-7 = 't', bits 8-15 = 's', bits 16-23 = 'e', bits 24-31 = 't'

So that's the order in which the bytes are written to memory: "tset"

If you then address that memory as separate bytes (unsigned chars), you'll read them in the order they were written to memory.
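If the goal is to get "test" back no matter how the int is laid out in memory, one option is to unpack with shifts instead of memcpy. Shifts work on the value of the integer (bits 24-31, 16-23, and so on), not on its addresses. This is only a sketch; `unpack_msb_first` is a name invented here, and it assumes the value was packed with the first character in the top byte, as in the question.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: the inverse of the question's packing expression.
// Each shift selects a byte by its significance, so the result does not
// depend on the host's byte order.
std::string unpack_msb_first(std::uint32_t v) {
    std::string s(4, '\0');
    s[0] = static_cast<char>((v >> 24) & 0xFF);  // bits 24-31
    s[1] = static_cast<char>((v >> 16) & 0xFF);  // bits 16-23
    s[2] = static_cast<char>((v >> 8) & 0xFF);   // bits 8-15
    s[3] = static_cast<char>(v & 0xFF);          // bits 0-7
    return s;
}
```

With the value from the question, `unpack_msb_first(0x74657374u)` yields the four characters in their original order.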

On a little-endian platform the output should be tset. The original sequence was test, from lower addresses to higher addresses. You then put it into an unsigned int with the first 't' going into the most significant byte and the last 't' going into the least significant byte. On a little-endian machine the least significant byte is stored at the lowest address. That is the order in which the bytes are copied to the final buff, and therefore the order in which they are output: from the last 't' to the first 't', i.e. tset.

On a big-endian machine you would not observe the reversal.
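If you want to check which case you are in at run time, a common trick is to look at the first byte of a known integer. A minimal sketch follows; the function name `host_is_little_endian` is hypothetical, and it only distinguishes little-endian from everything else.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: store the value 1 in a 32-bit integer and inspect
// the byte at its lowest address. On a little-endian host that byte holds
// the least significant byte, which is 1; on a big-endian host it is 0.
bool host_is_little_endian() {
    const std::uint32_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);  // copy only the lowest-addressed byte
    return first == 1;
}
```

On the asker's machine this should return true, which matches the "tset" output.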

How about adding a '\0' to your buff?
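For example, something along these lines would avoid the trailing garbage. It is a sketch, not the asker's code: `bytes_as_text` is an invented name, and it assumes none of the copied bytes is itself zero, since a zero byte would cut the string short.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copy the bytes of `value` into a buffer that has
// one extra slot for the terminator, so it can be printed safely.
std::string bytes_as_text(unsigned int value) {
    char buff[sizeof(unsigned int) + 1];
    std::memcpy(buff, &value, sizeof(unsigned int));
    buff[sizeof(unsigned int)] = '\0';  // terminate right after the copied bytes
    return std::string(buff);
}
```

The byte order is unchanged by this, so with a 4-byte unsigned int on a little-endian host you still get "tset", just without garbage after it.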

You have written a test for the platform's byte order, and it has come to the conclusion: little-endian.
