
gcc and clang are giving different results

#include <stdio.h> 

int main(int argc, char *argv[]){

    int a;
    int *b = &a;
    a = 10;

    printf("%d %d\n", a, *b);

    int p = 20;
    int *q;
    *q = p;

    printf("%d %d\n", p, *q);

    int *t = NULL;

    return 0;
}

The above program, when compiled with gcc, gives a segmentation fault on execution. When compiled with clang, it runs without a segmentation fault. Can anybody give the reason? The gcc version is 9.3.0 and the clang version is 10.0.0. The OS is Ubuntu 20.04.

Problem:


The problem does not stem from the compiler; it is in the code itself, specifically *q = p. When you dereference a pointer, i.e. apply *, you access the object located at the address stored in the pointer. Here that is invalid because q was never made to point at anything: no memory has been assigned to it, so it points nowhere (or at least nowhere we'd like it to point). You can't store anything through it, because the address it holds is whatever garbage value happens to be in q, which may not be a valid memory location at all.

Behavior explained:


Given the above, and knowing that the value stored in q can be anything, you can and should expect different results with different compilers, different versions of the same compiler, or even different runs of the same program built with the same compiler on the same machine. By sheer chance q might even hold a usable address, and the program would then appear to give the expected result. That is why the behavior of your program is undefined.

Fixing it:


As already stated, q needs to point to a valid memory location before you can store a value there. You can either allocate memory and assign it to q, or make q point to an existing valid object, e.g.:

int p;
int *q;
q = &p; // now q points to p, i.e. its value is the address of p

Now you can do:

*q = 10; // stores 10 in the memory address pointed by(stored in) q, the address of p
         // p is now 10

Or

int value = 20;
*q = value; // stores a copy of value in the address of the variable pointed by q, again p
            // p is now 20

Extra:


Note that if you use extra warning flags like -Wall or -Wextra among others, the compiler is likely to warn you about faulty constructs like the one you have.

I'm not an expert in compilers, but you are certainly triggering undefined behaviour right here:

int *q;
*q=p;
printf("%d %d\n",p,*q);

You are dereferencing pointer q before initializing it. There can be several reasons why this segfaults (or rather, doesn't segfault). q could point to any memory location; in the Clang build it could, for example, hold a stale value of b left over on the stack, in which case the write lands in memory the program is allowed to access.

Not sure what your original intentions were with this piece of code, though.

The reason is that anything can happen if you use a variable which has not been initialized. If you compile this program with warnings enabled you should get a warning like

t.c:10:5: warning: ‘q’ is used uninitialized in this function [-Wuninitialized]
  *q = p;
  ~~~^~~

Before initialization, a variable can have any value which happens to be at the memory location where the variable is allocated. That's why the runtime behavior is unpredictable. The following picture illustrates the situation before the assignment of p :

[Image omitted in this copy]

Since we don't know where q points, we cannot dereference (follow) the pointer.

Can anybody give the reason?

The reason comes from the C language itself and how compilers were constructed.

First, the C language: your code invokes undefined behavior. Using an uninitialized variable is already undefined behavior, and more obviously you are applying the * operator to an "invalid" pointer. Bottom line: there is undefined behavior.

Now, because there is undefined behavior, compilers can do what they want and generate code however they want. In short, there are no requirements.

Because of that, compiler writers do not care what their compilers do in cases of undefined behavior. The two compilers were constructed differently and so act differently when compiling this specific code. It was not deliberate: no one designed for this case, so unrelated decisions in unrelated parts of each compiler happened to produce this behavior.

The specific reasons why the two compilers behave differently can only be found by inspecting their source code. In this case, inspecting LLVM together with its documentation, and gcc with its developer options, would be helpful along the way.

The line *q=p; uses the value of q which is uninitialized; accessing an uninitialized variable is undefined behaviour, allowing the compilers to interpret both that line of code and anything preceding or following the line in any way at all.

It'd probably give different results for different levels of optimization as well.
