
C: type casting char values into unsigned short

Starting with a pseudo-code snippet:

char a = 0x80;
unsigned short b;
b = (unsigned short)a;
printf ("0x%04x\r\n", b); // => 0xff80

To my current understanding, "char" is by definition neither a signed char nor an unsigned char but a sort of third type of signedness.

How does it come about that 'a' is first sign-extended from (maybe platform-dependent) 8 bits of storage to the (again maybe platform-specific) 16 bits of a signed short and then converted to an unsigned short?

Is there a C standard that determines the order of expansion?

Does this standard give any guidance on how to deal with that third type of signedness of a "pure" char (I once called it an X-char, X for undetermined signedness) so that results are at least deterministic?

PS: If I insert an "(unsigned char)" cast in front of the 'a' in the assignment line, the result in the printing line does indeed change to 0x0080. Thus only two type casts in a row provide what might be the intended result in some cases.

The type char is not a "third" signedness. It behaves as either signed char or unsigned char, and which one is implementation-defined.

This is dictated by section 6.2.5p15 of the C standard:

The three types char, signed char, and unsigned char are collectively called the character types. The implementation shall define char to have the same range, representation, and behavior as either signed char or unsigned char.

It appears that on your implementation, char is the same as signed char, so because the value is negative and the destination type is unsigned, it must be converted.

Section 6.3.1.3 dictates how conversions between integer types occur:

1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Since the value stored in a, -128 (bit pattern 0x80), cannot be represented in an unsigned short, the conversion in paragraph 2 occurs.

char has implementation-defined signedness. It is either signed or unsigned, depending on the compiler. It is true, in a way, that char is a third character type, see this. Because the signedness of char is non-portable, char should never be used for storing raw numbers.

But that doesn't matter in this case.

  • On your compiler, char is signed.
  • char a = 0x80; forces a conversion from the type of 0x80, which is int, to char. Since 128 does not fit into a signed 8-bit char, the result is implementation-defined. Normally on two's complement systems, that will mean that the char gets the value -128, as seems to be the case here.
  • b = (unsigned short)a; forces a conversion from char to unsigned short 1). C17 6.3.1.3 Signed and unsigned integers then says:

    Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

    One more than the maximum value of a 16-bit unsigned short would be 65536. So you can think of this as -128 + 65536 = 65408.

  • The unsigned hex representation of 65408 is 0xFF80. No sign extension takes place anywhere!


1) The cast is not needed. When both operands of = are arithmetic types, as in this case, the right operand is implicitly converted to the type of the left operand (C17 6.5.16.1 §2).
