Junk appearing when converting to char from int32 in C/C++
I am using some simple code to convert a uint32_t variable to char.
uint32_t len = 4 + data.length(); //data is a string
char pop1 = len & 0xff;
char pop2 = (len >> 8) & 0xff;
char pop3 = (len >> 16) & 0xff;
char pop4 = (len >> 24) & 0xff; //also tried the same thing with memcpy
printf("%02x \n",pop1);
printf("%02x \n",pop2);
printf("%02x \n",pop3);
printf("%02x \n",pop4);
Output:
ffffff81
02
00
00
I fail to understand why junk is added to the first byte. When I use unsigned char instead, no junk is added. As I understand it, both char and unsigned char are 8 bits, so why is the char treated like a 32-bit value? I am using VS2015 on a 64-bit Windows machine. I want to use the char array with the send function of WinSock2.
send(ConnectSocket, sendbuf, size_to_send, 0); // sendbuf is a char array
When used in an expression, a char is first promoted to int. So if the value of the char is negative, that value is preserved when it is converted to int, and that is what you see when you print.
You can either cast the value to unsigned char to have it take on a positive value, or you can use the hh modifier on the %x format specifier to have it treat the value as an unsigned char.
printf("%02hhx \n",pop1);
printf("%02hhx \n",pop2);
printf("%02hhx \n",pop3);
printf("%02hhx \n",pop4);
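The cast alternative mentioned above could look roughly like the following. This is a minimal sketch, not code from the original answer: format_byte_hex is a hypothetical helper name, and it writes into a caller-supplied buffer with snprintf instead of printing so the result is easy to check.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): formats one byte as
   two hex digits. The cast to unsigned char is applied before the value
   is widened, so the argument handed to snprintf is in the range 0..255
   and no sign extension can occur. The outer cast to unsigned matches
   the type that %x expects. */
int format_byte_hex(char *out, size_t out_size, char value)
{
    return snprintf(out, out_size, "%02x", (unsigned)(unsigned char)value);
}
```

Called with a char holding -127 (or 129 where plain char is unsigned), the buffer should contain "81" rather than "ffffff81".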
When a value of a small integer type (like char) is passed as an argument to a vararg function (like printf), it is promoted to an int. This promotion can include sign extension if the small type is signed.
On two's complement systems (which have been the vast majority of computers for a long time), that means the int will be padded with 1 bits, which when printed as an unsigned int in hexadecimal will show up as f digits.
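The sign extension described here can be reproduced with explicit, well-defined conversions. The sketch below is an illustration under assumptions: the function names are hypothetical, and fixed-width types stand in for char, int and unsigned int so that the result does not depend on the platform's int size.

```c
#include <stdint.h>

/* Widen a signed 8-bit value to 32 bits, then convert the result to
   unsigned. This mirrors what typically happens when a negative char
   ends up being printed with %x. */
uint32_t widen_signed(int8_t value)
{
    int32_t promoted = value;   /* value-preserving: -127 stays -127 */
    return (uint32_t)promoted;  /* modulo 2^32: -127 becomes 0xffffff81 */
}

/* The same widening starting from an unsigned 8-bit value: the upper
   bits are filled with zeros instead. */
uint32_t widen_unsigned(uint8_t value)
{
    return (uint32_t)value;     /* 129 stays 0x00000081 */
}
```

The only difference between the two paths is the signedness of the 8-bit type, which is why switching to unsigned char makes the leading f digits disappear.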
The simple solution is to not use char, but preferably uint8_t or an explicit unsigned char type for your variables.
Think of all the type changes and conversions happening. There are at least 4.
0xff, an int, is converted to uint32_t, then the & occurs. No problems here.
len & 0xff;
Then that result is assigned to a char, a signed char in OP's case. That assigns 0x81 (129), which is out of range for the char --> implementation-defined behavior. A common result simply keeps the least significant bits.
char pop1 = len & 0xff;
why the char is treated like a 32-bit value (?)
It is not yet treated as a 32-bit unsigned value, but as an 8-bit signed value.
Then the code passes char pop1 (with maybe a value of -127) to printf() and incurs a default argument promotion as an argument to the ... function. printf() receives an int with the value of -127.
printf(...,pop1);
printf("%02x \n",pop1); expects an unsigned and not an int. As the value of -127 is not representable as both an int and unsigned (C11 §6.5.2.2 6), the conversion specifier is not valid with that argument and the result is undefined behavior (UB) (§7.21.6.1 9). What typically happens is that the bit pattern of -127 passed as an int is interpreted as the bit pattern for an unsigned and results in "ffffff81".
printf("%02x \n",pop1);
To avoid the implementation-defined behavior and the UB, I recommend the below. For effective unsigned code, use clearly unsigned types, objects and constants.
unsigned char pop1 = len & 0xffu;
// or
uint8_t pop1 = len & 0xffu;
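Applied to all four bytes from the question, the recommendation might be sketched as follows. The helper name pack_u32_le is hypothetical (it does not appear in the original post), and the byte order simply follows pop1..pop4 in the question, least significant byte first.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): stores a 32-bit
   length into four unsigned bytes, least significant byte first,
   matching the order of pop1..pop4 in the question. */
void pack_u32_le(uint8_t out[4], uint32_t len)
{
    out[0] = (uint8_t)(len & 0xffu);
    out[1] = (uint8_t)((len >> 8) & 0xffu);
    out[2] = (uint8_t)((len >> 16) & 0xffu);
    out[3] = (uint8_t)((len >> 24) & 0xffu);
}
```

For a length of 0x281, which is the value implied by the question's output, the bytes come out as 0x81, 0x02, 0x00, 0x00. Since send takes a const char * buffer, a uint8_t array filled this way would presumably be passed with a cast such as (const char *)out.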