
What is the rule for C to cast between short and int?

I'm confused about how C converts between short and int. I assume short is 16-bit and int is 32-bit. I tested with the code below:

unsigned short a = 0xFFFF;
signed short b = 0xFFFF;

unsigned int u16tou32 = a;
unsigned int s16tou32 = b;
signed int u16tos32 = a;
signed int s16tos32 = b;

printf("%u %u %d %d\n", u16tou32, s16tou32, u16tos32, s16tos32);

What I got is:

  • u16tou32: 65535
  • s16tou32: 4294967295
  • u16tos32: 65535
  • s16tos32: -1

What confuses me is the conversion from s16 to u32, and from u16 to s32. It seems that s16 to u32 does a "sign extension", while u16 to s32 does not. What exactly is the rule behind this? Is it implementation-dependent? Is it safe to do this type of casting in C, or should I use bit manipulation myself to avoid unexpected results?

Anytime an integer type is being converted to a different integer type it falls through a deterministic pachinko machine of rules as dictated by the standard and on one occasion, the implementation.

First, the general rule on the integer promotions:

C99 6.3.1.1-p2

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int . These are called the integer promotions. All other types are unchanged by the integer promotions.
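As a rough illustration of the quoted rule, the sketch below reports which type a short or an unsigned short has after the integer promotions. It needs C11 because it relies on _Generic. The macro and function names are made up for this example, and the result for unsigned short assumes int is wider than short, as in the question.

```c
/* Classify an expression's type after the integer promotions.
   Unary plus applies the promotions to its operand. Requires C11. */
#define PROMOTED_KIND(x) _Generic(+(x), int: 'i', unsigned int: 'u', default: '?')

/* Hypothetical helper, named for this example only. */
char promoted_kind_of_unsigned_short(void) {
    unsigned short a = 0xFFFF;
    /* 'i' when int can hold every unsigned short value, 'u' otherwise */
    return PROMOTED_KIND(a);
}

/* Hypothetical helper, named for this example only. */
char promoted_kind_of_signed_short(void) {
    short b = -1;
    /* always 'i': int can represent every value of short */
    return PROMOTED_KIND(b);
}
```

With a 16-bit short and a 32-bit int, both helpers should return 'i', which is why an unsigned short never picks up a sign on its own.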

That said, let's look at your conversions. The signed short to unsigned int conversion is covered by the following, since the value being converted (-1) falls outside the range of unsigned int:

C99 6.3.1.3-p2

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

For a value of -1, that basically means "add UINT_MAX+1" once. On your machine, UINT_MAX is 4294967295, therefore, this becomes

-1 + 4294967295 + 1 = 4294967295
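The arithmetic above can be checked with a small sketch that follows the wording of 6.3.1.3-p2 literally and compares it with a plain conversion. The function names are invented for this example, and it assumes long long is wider than unsigned int, so that UINT_MAX + 1 can be computed without wrapping.

```c
#include <limits.h>

/* Convert a short to unsigned int the way C99 6.3.1.3p2 words it.
   Assumption: long long is wider than unsigned int. */
unsigned int to_unsigned_by_rule(short v) {
    long long modulus = (long long)UINT_MAX + 1;  /* one more than the maximum */
    long long wide = v;
    while (wide < 0)
        wide += modulus;                          /* "repeatedly adding" */
    while (wide >= modulus)
        wide -= modulus;                          /* "or subtracting" (never needed for a short) */
    return (unsigned int)wide;                    /* now in range, so the value is preserved */
}

/* What the compiler does for you with an ordinary conversion. */
unsigned int to_unsigned_builtin(short v) {
    return (unsigned int)v;
}
```

Both functions should agree for every short value; for -1 each yields UINT_MAX.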

Regarding your unsigned short to signed int conversion, that is covered by the ordinary value-preserving rule. Specifically:

C99 6.3.1.3-p1

When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

In other words, because the value of your unsigned short falls within the coverable domain of signed int , there is nothing special done and the value is simply saved.

And finally, as mentioned in the opening remark above, something special happens in your declaration of b:

signed short b = 0xFFFF;

The constant 0xFFFF in this case has type int, a signed integer type, because a 32-bit int can represent it. Its decimal value is 65535. However, that value is not representable by a signed short, so yet another conversion happens, one that perhaps you weren't aware of:

C99 6.3.1.3-p3

Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

In other words, your implementation chose to store it as (-1) , but you cannot rely on that on a different implementation.
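If you want to avoid depending on that implementation-defined result, one option is to check the range before narrowing. The helper below is a minimal sketch; its name and interface are made up for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores v in *out and returns true only when v is
   representable as short; otherwise returns false and leaves *out untouched. */
bool checked_int_to_short(int v, short *out) {
    if (v < SHRT_MIN || v > SHRT_MAX)
        return false;       /* narrowing would be implementation-defined */
    *out = (short)v;        /* value fits, so it is preserved */
    return true;
}
```

With a 16-bit short, passing 0xFFFF is rejected, while passing -1 succeeds and stores -1.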

What's happening here is that the right-hand side of the assignment is first extended from 16 to 32 bits, and the conversion to the left-hand-side type only happens at assignment. This means that if the right-hand side is signed, it is sign-extended when it's converted to 32 bits, and if it's unsigned, it is simply zero-padded.

If you're careful with your casts, there shouldn't be any problem. But unless you're doing something very performance-intensive, the extra couple of bitwise operations shouldn't hurt anything either.

On another note, if you're doing anything where you're assuming certain bit-widths for different integer types, you should really be explicit and use the types defined in stdint.h . I just recently got bitten by this while porting (someone else's) code from *nix to Windows, as the Visual C++ compiler uses a different convention for integer sizes (LLP64) than any other x64 or POWER7 compiler I've used (LP64). In short, if you want 32 bits, you're better off saying it explicitly with a type like uint32_t .
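As a sketch of what being explicit might look like, the two helpers below widen a 16-bit pattern to 32 bits using the stdint.h types, once without and once with sign extension. The function names are invented for this example. The signed variant works purely on values, so it does not rely on any implementation-defined narrowing.

```c
#include <stdint.h>

/* Widen a 16-bit pattern to 32 bits without sign extension. */
uint32_t zero_extend16(uint16_t bits) {
    return bits;                /* every uint16_t value fits, so it is preserved */
}

/* Interpret a 16-bit pattern as two's complement and widen it to 32 bits. */
int32_t sign_extend16(uint16_t bits) {
    int32_t value = bits;       /* 0 .. 65535 */
    if (value >= 0x8000)
        value -= 0x10000;       /* map 0x8000..0xFFFF to -32768..-1 */
    return value;
}
```

For the pattern 0xFFFF, zero_extend16 gives 65535 and sign_extend16 gives -1, which are the two results the question observed.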


So will this always hold when such a conversion happens in C? Is it defined by the C standard? – Jun

Yes, it should always hold. Relevant quotes from the C99 standard: "The integer promotions preserve value including sign." When handling usual arithmetic type conversions: "... the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands..."

As stated in the question, assume 16-bit short and 32-bit int .

unsigned short a = 0xFFFF;

This initializes a to 0xFFFF , or 65535 . The expression 0xFFFF is of type int ; it's implicitly converted to unsigned short , and the value is preserved.

signed short b = 0xFFFF;

This is a little more complicated. Again, 0xFFFF is of type int . It's implicitly converted to signed short -- but since the value is outside the range of signed short the conversion cannot preserve the value.

Conversion of an integer to a signed integer type, when the value can't be represented, yields an implementation-defined value. In principle, the value of b could be anything between -32768 and +32767 inclusive. In practice, it will almost certainly be -1 . I'll assume for the rest of this that the value is -1 .

unsigned int u16tou32 = a;

The value of a is 0xFFFF , which is converted from unsigned short to unsigned int . The conversion preserves the value.

unsigned int s16tou32 = b;

The value of b is -1 . It's converted to unsigned int , which clearly cannot store a value of -1 . Conversion of an integer to an unsigned integer type (unlike conversion to a signed type) is defined by the language; the result is reduced modulo MAX + 1 , where MAX is the maximum value of the unsigned type. In this case, the value stored in s16tou32 is -1 + UINT_MAX + 1 , which is UINT_MAX , or 0xFFFFFFFF .

signed int u16tos32 = a;

The value of a , 0xFFFF , is converted to signed int . The value is preserved.

signed int s16tos32 = b;

The value of b , -1 , is converted to signed int . The value is preserved.

So the stored values are:

a == 0xFFFF (65535)
b == -1     (not guaranteed, but very likely)
u16tou32 == 0xFFFF (65535)
s16tou32 == 0xFFFFFFFF (4294967295)
u16tos32 == 0xFFFF (65535)
s16tos32 == -1
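The walkthrough can be reproduced with a sketch like the one below. The struct and function names are invented for this example. b is initialized with -1 directly, to sidestep the implementation-defined narrowing of 0xFFFF, and the comments assume a 16-bit short and a 32-bit int as in the question.

```c
/* Results of the four conversions from the question. */
struct conversions {
    unsigned int u16tou32;
    unsigned int s16tou32;
    int u16tos32;
    int s16tos32;
};

struct conversions run_conversions(void) {
    unsigned short a = 0xFFFF;  /* 65535, assuming a 16-bit short */
    short b = -1;               /* written as -1 rather than 0xFFFF */
    struct conversions r;
    r.u16tou32 = a;             /* value fits: 65535 */
    r.s16tou32 = b;             /* -1 reduced modulo UINT_MAX + 1: UINT_MAX */
    r.u16tos32 = a;             /* value fits in a 32-bit int: 65535 */
    r.s16tos32 = b;             /* value fits: -1 */
    return r;
}
```

When printing the results, each format specifier should match its argument's type: %u for the two unsigned int members and %d for the two int members.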

To summarize the integer conversion rules:

If the target type can represent the value, the value is preserved.

Otherwise, if the target type is unsigned, the value is reduced modulo MAX+1 , which is equivalent to discarding all but the low-order N bits. Another way to describe this is that the value MAX+1 is repeatedly added to or subtracted from the value until you get a result that's in the range (this is actually how the C standard describes it). Compilers don't actually generate code to do this repeated addition or subtraction; they just have to get the right result.

Otherwise, the target type is signed and cannot represent the value; the conversion yields an implementation-defined value. In almost all implementations, the result discards all but the low-order N bits using a two's-complement representation. (C99 added a rule for this case, permitting an implementation-defined signal to be raised instead. I don't know of any compiler that does this.)
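The statement that reduction modulo MAX+1 is equivalent to discarding all but the low-order N bits can be checked for N = 16 with a sketch like this one. The function names are made up for this example. Both paths only convert to unsigned types, so both are fully defined by the language.

```c
#include <stdint.h>

/* Let the language rule do the work: the result is reduced modulo 65536. */
uint16_t narrow_by_conversion(int32_t v) {
    return (uint16_t)v;
}

/* Do it by hand: keep only the low-order 16 bits. */
uint16_t narrow_by_mask(int32_t v) {
    return (uint16_t)((uint32_t)v & 0xFFFFu);
}
```

The two functions should return the same result for every input; for example, both map -1 to 0xFFFF and 0x12345 to 0x2345.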

This is an unsigned short representation of the number 65535:

unsigned short a = 0xFFFF;

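On typical implementations, this is a signed short representation of the number -1 (the result of converting 65535 to signed short is implementation-defined):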

signed short b = 0xFFFF;

Simple promotion from unsigned short to unsigned int, so u16tou32 is an unsigned int representation of the number 65535:

unsigned int u16tou32 = a;

b (value of -1) is promoted to int. Thus its hex representation would be 0xFFFFFFFF. It is then cast to unsigned, so is a representation of the number 4294967295:

unsigned int s16tou32 = b;

a (value of 65535) is converted to signed int. A 32-bit int can represent 65535, so the value is unchanged and u16tos32 is also a representation of the number 65535:

signed int u16tos32 = a;

Simple promotion of signed short to signed int, so s16tos32 is also a representation of the number -1:

signed int s16tos32 = b;
