
Junk appearing when converting to char from int32 in C/C++

I am using some simple code to split a uint32_t variable into individual chars.

uint32_t len = 4 + data.length(); //data is a string
char pop1 = len & 0xff;
char pop2 = (len >> 8) & 0xff;
char pop3 = (len >> 16) & 0xff;
char pop4 = (len >> 24) & 0xff;     //also tried the same thing with memcpy

printf("%02x \n",pop1);
printf("%02x \n",pop2);
printf("%02x \n",pop3);
printf("%02x \n",pop4); 

Output:

ffffff81
02
00
00

I fail to understand why junk is added to the first byte. When I use unsigned char instead, no junk is added. In my understanding both char and unsigned char are 8 bits, so why is the char treated like a 32-bit value? I am using VS2015 on a 64-bit Windows machine. I want to use the char array with the send function of WinSock2.

send(ConnectSocket, sendbuf, size_to_send, 0); // sendbuf is a char array

When used in an expression, a char is first promoted to int. So if the value of the char is negative, that value is preserved when it is converted to int, and that is what you see when you print.

You can either cast the value to unsigned char to have it take on a positive value, or you can use the hh modifier on the %x format specifier to have it treat the value as an unsigned char.

printf("%02hhx \n",pop1);
printf("%02hhx \n",pop2);
printf("%02hhx \n",pop3);
printf("%02hhx \n",pop4);

When a value of a small integer type (like char) is passed as an argument to a vararg function (like printf), it is promoted to an int.

This promotion can include sign extension if the small type is signed.

On two's complement systems (which have been the vast majority of computers for a long time), that means the upper bits of the int are filled with 1 bits, which show up as f digits when the value is printed as an unsigned int in hexadecimal.

The simple solution is to not use char, but preferably uint8_t or an explicit unsigned char type for your variables.

Think of all the type changes and conversions happening. There are at least 4.

0xff, an int, is converted to uint32_t, then the & occurs. No problems here.

len & 0xff;

Then that result is assigned to a char, a signed char in OP's case. That assigns 0x81 (129), which is out of range for the char --> implementation-defined behavior. A common result simply keeps the least significant bits, which an 8-bit signed char reads as -127.

char pop1 = len & 0xff;

"why the char is treated like a 32-bit value (?)"

It is not yet treated as a 32-bit unsigned value, but as an 8-bit signed value.

Then the code passes char pop1 (with perhaps a value of -127) to printf() and incurs a default argument promotion as an argument to the ... function. printf() receives an int with the value of -127.

printf(...,pop1);

"%02x" expects an unsigned and not an int. As the value of -127 is not representable as both an int and unsigned (C11 §6.5.2.2 6), the conversion specifier is not valid with that argument and the result is undefined behavior (UB) (§7.21.6.1 9). What typically happens is that the bit pattern of -127 passed as an int is interpreted as the bit pattern for an unsigned and results in "ffffff81".

printf("%02x \n",pop1);

To avoid the implementation-defined behavior and the UB, I recommend the below. For effective unsigned code, use clearly unsigned types, objects and constants.

unsigned char pop1 = len & 0xffu;
// or
uint8_t pop1 = len & 0xffu;
