
What is the difference between literals and variables in C (signed vs unsigned short ints)?

I found the following code in the book Computer Systems: A Programmer's Perspective, 2/E. It works as described, and the output can be explained by the difference between signed and unsigned representations.

#include<stdio.h>
int main() {
    if (-1 < 0u) {
        printf("-1 < 0u\n");
    }
    else {
        printf("-1 >= 0u\n");
    }
    return 0;
}

The code above yields -1 >= 0u. However, the following code, which I expected to behave the same way, does not:

#include <stdio.h>

int main() {

    unsigned short u = 0u;
    short x = -1;
    if (x < u)
        printf("-1 < 0u\n");
    else
        printf("-1 >= 0u\n");
    return 0;
}

It yields -1 < 0u. Why does this happen? I cannot explain it.

Note that I have seen similar questions like this , but they do not help.

PS. As @Abhineet said, the dilemma can be solved by changing short to int. However, how can one explain this phenomenon? In other words, -1 in 4 bytes is 0xff ff ff ff and in 2 bytes is 0xff ff. Taken as two's complement bit patterns and interpreted as unsigned, they have the values 4294967295 and 65535 respectively. Neither is less than 0, so I think the output should be -1 >= 0u in both cases, i.e. x >= u.

A sample output on a little-endian Intel system:

For short:

-1 < 0u
u =
 00 00
x =
 ff ff

For int:

-1 >= 0u
u =
 00 00 00 00
x =
 ff ff ff ff

The code above yields -1 >= 0u

All integer literals (numeric constants) have a type and therefore also a signedness. By default, they are of type int, which is signed. When you append the u suffix, you turn the literal into an unsigned int.

For any C expression with one signed and one unsigned operand of the same conversion rank (as with int and unsigned int here), the balancing rule (formally: the usual arithmetic conversions) implicitly converts the signed operand to unsigned.

Conversion from signed to unsigned is well-defined (6.3.1.3):

Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

For example, with 32-bit integers on a standard two's complement system, the maximum value of an unsigned int is 2^32 - 1 (4294967295, UINT_MAX in limits.h). One more than the maximum value is 2^32, and -1 + 2^32 = 4294967295, so the value -1 is converted to an unsigned int with the value 4294967295, which is larger than 0.


When you switch the types to short, however, you end up with a small integer type. This is the difference between the two examples. Whenever a small integer type is part of an expression, the integer promotion rule implicitly converts it to int (6.3.1.1):

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

If short is smaller than int on the given platform (as is the case on typical 32- and 64-bit systems), any short or unsigned short will therefore always get converted to int, because every value of either type fits in an int.

So for the expression if (x < u), you actually end up with if ((int)x < (int)u), which behaves as expected (-1 is less than 0).

You're running into C's integer promotion rules.

Operators on types smaller than int automatically promote their operands to int or unsigned int. See comments for more detailed explanations. There is a further step for binary (two-operand) operators if the types still don't match after that (e.g. unsigned int vs. int). I won't try to summarize the rules in more detail than that. See Lundin's answer.

This blog post covers this in more detail, with a similar example to yours: signed and unsigned char. It quotes the C99 spec:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.


You can experiment with this more easily on something like godbolt, using a function that returns one or zero. Just look at the compiler output to see what ends up happening.

#define mytype short

int main() {
    unsigned mytype u = 0u;
    mytype x = -1;
    return (x < u);
}

Contrary to what you seem to assume, this is not a property of the particular width of the types (here 2 bytes versus 4 bytes), but a question of which rules are applied. The integer promotion rules state that short and unsigned short are converted to int on all platforms where the corresponding ranges of values fit into int. Since this is the case here, both values are preserved and obtain the type int. -1 is perfectly representable in int, as is 0. So the test correctly finds that -1 is smaller than 0.

In the case of testing -1 against 0u, the usual arithmetic conversions choose the unsigned type as the common type to which both operands are converted. -1 converted to unsigned is the value UINT_MAX, which is larger than 0u.

This is a good example of why you should never use "narrow" types for arithmetic or comparison. Only use them if you have a severe size constraint. This will rarely be the case for simple variables, but mostly for large arrays, where you can really gain from storing values in a narrow type.

0u is not unsigned short; it is unsigned int.

Edit: An explanation of the behavior: how is the comparison performed?

As answered by Jens Gustedt,

This is called "usual arithmetic conversions" by the standard and applies whenever two different integer types occur as operands of the same operator.

In essence, what it does:

- if the types have different widths (more precisely, what the standard calls conversion rank), it converts to the wider type
- if both types have the same width, then, apart from really weird architectures, the unsigned one wins

Signed-to-unsigned conversion of the value -1, whatever its type, always results in the highest representable value of the unsigned type.

A more detailed blog post written by him can be found here.
