
comparison between signed and unsigned integer expressions and 0x80000000

I have the following code:

#include <iostream>

using namespace std;

int main()
{
    int a = 0x80000000;
    if(a == 0x80000000)
        a = 42;
    cout << "Hello World! :: " << a << endl;
    return 0;
}

The output is

Hello World! :: 42

so the comparison works. But the compiler tells me

g++ -c -pipe -g -Wall -W -fPIE  -I../untitled -I. -I../bin/Qt/5.4/gcc_64/mkspecs/linux-g++ -o main.o ../untitled/main.cpp
../untitled/main.cpp: In function 'int main()':
../untitled/main.cpp:8:13: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if(a == 0x80000000)
             ^

So the question is: Why is 0x80000000 an unsigned int? Can I make it signed somehow to get rid of the warning?

As far as I understand, 0x80000000 would be INT_MIN as it's out of range for a positive integer. But why is the compiler assuming that I want a positive number?

I'm compiling with gcc version 4.8.1 20130909 on linux.

0x80000000 is an unsigned int because the value is too big to fit in an int and you did not add an L suffix to specify it should be a long.

The warning is issued because unsigned in C/C++ has quite weird semantics, and therefore it's very easy to make mistakes by mixing signed and unsigned integers. This mixing is a frequent source of bugs, especially because the standard library, by historical accident, chose an unsigned type ( size_t ) for the size of containers.

An example I often use to show how subtle the problem is: consider

// Draw connecting lines between the dots
for (int i=0; i<pts.size()-1; i++) {
    draw_line(pts[i], pts[i+1]);
}

This code seems fine but has a bug. In case the pts vector is empty, pts.size() is 0 but, and here comes the surprising part, pts.size()-1 is a huge nonsensical number (often 4294967295 today, but it depends on the platform) and the loop will use invalid indexes (with undefined behavior).

Here, changing the variable to size_t i will remove the warning but is not going to help, as the very same bug remains...

The core of the problem is that with unsigned values, a < b-1 and a+1 < b are not the same thing, even for very commonly used values like zero; this is why using unsigned types for non-negative quantities like container sizes is a bad idea and a source of bugs.

Also note that your code is not correct, portable C++ on platforms where that value doesn't fit in an int: the conversion of an out-of-range value to a signed type is implementation-defined, and while overflow behavior is fully defined for unsigned types (arithmetic is modulo 2^N), C++ code that relies on what happens when a signed integer goes past its limits has undefined behavior.

Even if you know what happens on a specific hardware platform, note that the compiler/optimizer is allowed to assume that signed integer overflow never happens: for example, a test like a < a+1, where a is a regular int, can be considered always true by a C++ compiler.

It seems you are confusing two different issues: the encoding of something and the meaning of something. Here is an example: you see the number 97. This is a decimal encoding. But the meaning of this number is something completely different. It can denote the ASCII 'a' character, a very hot temperature, a geometric angle in a triangle, etc. You cannot deduce meaning from encoding. Someone must supply a context to you (like the ASCII table, a temperature scale, etc.).

Back to your question: 0x80000000 is encoding, while INT_MIN is meaning. They are not interchangeable and not directly comparable. On specific hardware, in some contexts, they might be equal, just like 97 and 'a' are equal in the ASCII context.

The compiler warns you about ambiguity in the meaning, not in the encoding. One way to give meaning to a specific encoding is a cast, like (unsigned short)-17 or (student*)ptr.

On a 32-bit system, or a 64-bit system where int remains 32 bits wide, both int and unsigned int have a 32-bit encoding like 0x80000000; but on a platform with a wider int, INT_MIN would not be equal to this number.

Anyway, the answer to your question: in order to remove the warning, you must give an identical context to both the left and right expressions of the comparison. You can do it in many ways. For example:

(unsigned int)a == (unsigned int)0x80000000, or (__int64)a == (__int64)0x80000000, or even a crazy (char *)a == (char *)0x80000000, or any other way, as long as you follow these rules:

  1. Don't demote the encoding (do not reduce the number of bits it requires). For example, (char)a == (char)0x80000000 is incorrect because it demotes 32 bits into 8 bits.
  2. Give both the left side and the right side of the == operator the same context. For example, (char *)a == (unsigned short)0x80000000 is incorrect and will yield an error/warning.

I want to give you another example of how crucial the difference between encoding and meaning is. Look at this code:

char a = -7;  
bool b = (a==-7) ? true : false;

What is the value of b ? The answer may surprise you: it is implementation-defined. Some compilers (typically Microsoft Visual Studio) will compile a program where b gets true , while with Android NDK compilers b will get false . The reason is that the Android NDK treats plain ' char ' as ' unsigned char ', while Visual Studio treats ' char ' as ' signed char '. So on Android phones the encoding -7 actually has the meaning 249 and is not equal to the meaning of (int)-7. The correct way to fix this problem is to define 'a' explicitly as signed char:

 signed char a = -7;  
 bool b = (a==-7) ? true : false;

0x80000000 is considered unsigned by default. You can avoid the warning like this:

    if (a == (int)0x80000000)
        a=42;

Edit after a comment:

Another (perhaps better) way would be

    if ((unsigned)a == 0x80000000)
        a=42;
