
C++ casting char into short

Pardon me for this newbie question. I recently found a strange thing when casting char into short. Basically, if the char has overflowed, then when it is cast to short the binary number is prepended with 11111111. If the char has not overflowed, it is prepended with 00000000.

For example,

char a = 130;
short b = (short)a;
printf("%hhx\n", a);
printf("%hx\n", b);

prints

82
ff82

While

char a = 125;
short b = (short)a;
printf("%hhx\n", a);
printf("%hx\n", b);

prints

7d
7d

So when casting, are the variable's type and value checked before deciding exactly which binary number it is cast to (deciding between prepending 0xFF or 0x00)? Is there any reason behind this? It seems that always doing (short)a & 0x00FF would be good practice?

char a = 130;

There's a high chance that char is 8 bits on your system, and we can guess based on the output that it is a signed type. In that case, the largest representable value of char is 127. 130 is greater than 127, so it isn't representable. In this case, the converted value will be the representable value that is congruent with 130 modulo 256 (2 to the power of the bit width), which is 130 - 256 = -126 (this is guaranteed since C++20; earlier standards leave the result implementation-defined, though mainstream compilers behave the same way). When you convert to the two-byte short, the value remains the same: -126. ff82 is how -126 is represented as a two-byte two's complement number.

It seems always doing (short)a & 0x00FF would be a good practice?

If you did that, then the value of b (130) would be different from the value of a (-126). Is it a "good practice" to get one result as opposed to another result? That depends on which result you need.

Bit masking only really makes sense with unsigned types.

Assigning an unrepresentable value to a signed integer type rarely makes sense.

Read up on 2's complement to see how negative numbers are encoded in binary.

For a signed char, assuming an 8-bit char width and a 2's complement architecture, a char can hold a value between -128 and +127.

When you say:

char a = 130;

That's out of range.

130 as a 32-bit integer in binary is: 00000000 00000000 00000000 10000010

In hex, it's: 00 00 00 82. That's where your 82 value is coming from.

When int(130) is cast to char, it's basically just chopping off all but the last byte of bits: 10000010.

Hence char a = <binary:10000010> is -126 in 2's complement arithmetic.

So when you assign short b = a, you're just assigning -126 to a short.

In a 2's complement architecture, when a negative number gets promoted to a larger type, it gets "sign extended". That is, if the most significant bit of the signed char is 1, then when it gets converted to short, the extra byte is filled with 1s as well. That is, -126 as a 16-bit binary is: 11111111 10000010, or 0xff82.

Try declaring a as unsigned char and you should get different results.
