
Concept: How are declarations linked to appropriate definitions

How exactly does a header file or any forward declarations know which definition it is referring to?

I understand that .cpp files are compiled independently, and we need a header file or forward declaration to access members of another .cpp file. But when we declare a member, we don't explicitly tell the compiler where to get the definition from.

Here is a case that I can think of: Say I have two cpp files 'one.cpp' and 'two.cpp'. Both 'one.cpp' and 'two.cpp' have a member 'int func(int x)' that have different implementations (but have the exact name and format). If we have a header file or declaration of this function somewhere outside these two files, how does the compiler know which definition to take?

Resolving each declaration to its definition is the linker's job. A function may be declared any number of times, but a non-inline function that is actually used must be defined exactly once across all the compilation units that are linked together. If two object files define a function with the same signature, the linker reports a multiple-definition error and refuses to finish building the executable.
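As a minimal sketch of the "declare many times, define once" rule, the file below repeats the declaration of func and calls it before its definition appears. Everything lives in one file only for brevity, so here the compiler itself sees the definition; when the definition sits in another .cpp file, it is the linker that makes the same connection. The names call_func and check_func are hypothetical helpers added for illustration.

```cpp
#include <cassert>

int func(int x);   // declaration, as a header such as myinc.h would provide
int func(int x);   // repeating a declaration is allowed

// Hypothetical caller: it compiles with only the declaration in scope.
int call_func(int x)
{
    return func(x);
}

// The single definition that every declaration above refers to.
int func(int x)
{
    return x + 1;
}

// Hypothetical self-check: the call reaches the one definition.
void check_func()
{
    assert(call_func(10) == 11);
}
```

Adding a second definition of func with the same parameter list to this file would be rejected by the compiler; putting it in a different .cpp file would be rejected later, at link time.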

I suggest that you create the example files that you described and try to build them into a single executable. You will see the error that I describe here.

If we have a header file or declaration of this function somewhere outside these two files, how does the compiler know which definition to take?

It's the linker, which takes one or more object files or libraries as input and combines them to produce an executable file. In doing so, it resolves references to external symbols, i.e. it looks for the definitions of all external functions and global variables in the other '.obj' files and in external libraries, assigns final addresses to functions and variables, and revises code and data to reflect the new addresses.

Let's consider the example you are mentioning in question:

Say I have two cpp files 'one.cpp' and 'two.cpp'. Both 'one.cpp' and 'two.cpp' have a member 'int func(int x)' that have different implementations....

Say, one.cpp :

int func(int x)
{
        return x+1;
}

and two.cpp :

int func(int x)
{
        return x+2;
}

and a header file declaring func() function, say myinc.h :

int func(int x);

and main() which is calling func() , say main.cpp :

#include <iostream>
#include <myinc.h>

int main()
{
        int res;
        res = func(10);
        std::cout << res << std::endl;
        return 0;
}

I can create the object file of main.cpp because an object file can refer to symbols that are not defined.

>g++ -I . -c main.cpp

Now let's examine the object file main.o using the nm command. The output is:

Symbols from main.o:

Name                  Value           Class        Type         Size             Line  Section

_GLOBAL__I_main     |0000000000000078|   t  |              FUNC|0000000000000015|     |.text    
_Z41__static_initialization_and_destruction_0ii|0000000000000038|   t  |              FUNC|0000000000000040|     |.text    
_Z4funci            |                |   U  |            NOTYPE|                |     |*UND*
.......
.......<SNIP>

The Class of func() is U, which means undefined. The compiler does not mind that it cannot find the definition of a function; it simply assumes that the function is defined in another file and leaves the reference for the linker to resolve.
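The name _Z4funci in the listing is the mangled form of func(int): the compiler encodes the function name and its parameter types into the symbol, and that symbol is what the linker matches. The sketch below turns a mangled name back into readable form. It assumes a GCC or Clang toolchain, where abi::__cxa_demangle is available from <cxxabi.h>; the helpers demangle and check_demangle are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstdlib>    // std::free
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Hypothetical helper: returns the readable form of a mangled symbol,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled)
{
    int status = 0;
    // With a null output buffer, the result is allocated with malloc.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);
    return result;
}

// Hypothetical self-check using the symbol from the nm listing.
void check_demangle()
{
    assert(demangle("_Z4funci") == "func(int)");
}
```

Because the parameter types are part of the symbol, an overload such as func(double) would get a different symbol and would not clash with func(int) at link time.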

The linker, on the other hand, looks at all the object files it is given and tries to find a definition for every symbol that was left undefined.

So, when we try creating an executable from the object files one.o , two.o and main.o :

>g++ two.o one.o main.o -o outexe
one.o: In function `func(int)':
one.cpp:(.text+0x0): multiple definition of `func(int)'
two.o:two.cpp:(.text+0x0): first defined here
collect2: ld returned 1 exit status

Here you can see the linker reporting a multiple definition error for func() because it finds two definitions of func() .

There is a One Definition Rule (ODR) in C++, which states that:

In the entire program, an object or non-inline function cannot have more than one definition; if an object or function is used, it must have exactly one definition. You can declare an object or function that is never used, in which case you don't have to provide a definition. In no event can there be more than one definition.

So, the program violates the ODR because it contains two definitions of the same function.
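If both implementations really are needed in one program, one possible way to stay within the ODR is to give them different names, for example by placing each in its own namespace. This is not part of the original example, only a sketch: the namespaces one and two and the helper check_namespaces are hypothetical names chosen to mirror one.cpp and two.cpp.

```cpp
#include <cassert>

// Would live in one.cpp
namespace one {
int func(int x)
{
    return x + 1;
}
}

// Would live in two.cpp
namespace two {
int func(int x)
{
    return x + 2;
}
}

// Hypothetical self-check: the caller now says which definition it wants.
void check_namespaces()
{
    assert(one::func(10) == 11);
    assert(two::func(10) == 12);
}
```

The namespace becomes part of each function's qualified name, so the linker sees two distinct symbols instead of two definitions of the same one.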

If we pass only one of one.o and two.o to the linker, that is, if we provide just one definition of func() , it generates the executable:

>g++ one.o main.o -o outexe

and if we examine outexe , we get:

Symbols from outexe:

Name                  Value           Class        Type         Size             Line  Section

.....<SNIP>    
_Z41__static_initialization_and_destruction_0ii|000000000040082c|   t  |              FUNC|0000000000000040|     |.text
_Z4funci            |00000000004007e4|   T  |              FUNC|000000000000000f|     |.text
.......
.......<SNIP>

The symbol for func() now has Type FUNC and Class T , which means the symbol is defined in the text (code) section .
