C++: Compiler and Linker functionality

I want to understand exactly which parts of a program the compiler looks at and which parts the linker looks at. So I wrote the following code:

#include <iostream>
using namespace std;
#include <string>

class Test {
private:
    int i;

public:
    Test(int val) {i=val ;}
    void DefinedCorrectFunction(int val);
    void DefinedIncorrectFunction(int val);
    void NonDefinedFunction(int val);

    template <class paramType>
    void  FunctionTemplate (paramType val) { i = val }
};

void Test::DefinedCorrectFunction(int val)
{
    i = val;
}

void Test::DefinedIncorrectFunction(int val)
{
    i = val
}

void main()
{
    Test testObject(1);
    //testObject.NonDefinedFunction(2);
    //testObject.FunctionTemplate<int>(2);

}

I have four functions:

  • DefinedCorrectFunction - A normal function, declared and defined correctly.
  • DefinedIncorrectFunction - Declared correctly, but the implementation is wrong (missing ;).
  • NonDefinedFunction - Declaration only. No definition.
  • FunctionTemplate - A function template.

Now if I compile this code I get a compiler error for the missing ';' in DefinedIncorrectFunction.
Suppose I fix this and then uncomment testObject.NonDefinedFunction(2). Now I get a linker error. Now uncomment testObject.FunctionTemplate<int>(2). Now I get a compiler error for the missing ';'.

For function templates, I understand that the compiler does not touch them unless they are invoked in the code. So the compiler does not complain about the missing ';' until I call testObject.FunctionTemplate<int>(2).

For testObject.NonDefinedFunction(2), the compiler did not complain but the linker did. As I understand it, all the compiler cared about was that a function named NonDefinedFunction is declared. It didn't care about the implementation. Then the linker complained because it could not find the implementation. So far so good.

Where I get confused is when the compiler complained about DefinedIncorrectFunction. It didn't look for an implementation of NonDefinedFunction, but it did go through DefinedIncorrectFunction.

So I'm a little unclear as to what exactly the compiler does and what the linker does. My understanding is that the linker links components with their calls. So when NonDefinedFunction is called, it looks for the compiled implementation of NonDefinedFunction and complains. But the compiler didn't care about the implementation of NonDefinedFunction, while it did for DefinedIncorrectFunction.

I'd really appreciate it if someone could explain this or provide some reference.

Thank you.

The function of the compiler is to compile the code that you have written and convert it into object files. So if you have missed a ; or used an undeclared variable, the compiler will complain, because these are errors it can detect within the source file itself.

If the compilation proceeds without a hitch, the object files are produced. Object files have a complex structure, but basically contain five things:

  1. Headers - Information about the file
  2. Object Code - Code in machine language (in most cases this code cannot run by itself)
  3. Relocation Information - Which portions of the code will need their addresses changed when the program is actually executed
  4. Symbol Table - Symbols referenced by the code. They may be defined in this code, imported from other modules, or defined by the linker
  5. Debugging Info - Used by debuggers

The compiler compiles the code and fills the symbol table with every symbol it encounters. Symbols refer to both variables and functions. The answer to this question explains the symbol table:

This contains a collection of executable code and data that the linker can process into a working application or shared library. The object file has a data structure called a symbol table in it that maps the different items in the object file to names that the linker can understand.

The point to note:

If you call a function from your code, the compiler doesn't put the final address of the routine in the object file. Instead, it puts a placeholder value into the code and adds a note that tells the linker to look up the reference in the various symbol tables from all the object files it's processing and stick the final location there.

The generated object files are processed by the linker, which fills in the blanks in the symbol tables, links one module to another, and finally produces the executable code that can be loaded by the loader.

So in your specific case:

  1. DefinedIncorrectFunction() - The compiler gets the definition of the function and begins compiling it to produce the object code and insert the appropriate entry into the symbol table. Compilation fails due to the syntax error, so the compiler aborts with an error.
  2. NonDefinedFunction() - The compiler gets the declaration but no definition, so it adds an entry to the symbol table and flags it for the linker to fill in the appropriate value (since the linker will process a bunch of object files, it is possible that this definition is present in some other object file). In your case you do not specify any other file, so the linker aborts with an "undefined reference to NonDefinedFunction" error because it cannot find a definition for that symbol table entry.

To understand it further, let's say your code is structured as follows.

File: try.h

#include<string>
#include<iostream>


class Test {
private:
    int i;

public:
    Test(int val) {i=val ;}
    void DefinedCorrectFunction(int val);
    void DefinedIncorrectFunction(int val);
    void NonDefinedFunction(int val);

    template <class paramType>
    void  FunctionTemplate (paramType val) { i = val; }
};

File: try.cpp

#include "try.h"


void Test::DefinedCorrectFunction(int val)
{
    i = val;
}

void Test::DefinedIncorrectFunction(int val)
{
    i = val;
}

int main()
{

    Test testObject(1);
    testObject.NonDefinedFunction(2);
    //testObject.FunctionTemplate<int>(2);
    return 0;
}

Let us first only compile and assemble the code, but not link it:

$g++ -c try.cpp -o try.o
$

This step proceeds without any problem, so you have the object code in try.o. Let's try to link it:

$g++ try.o
try.o: In function `main':
try.cpp:(.text+0x52): undefined reference to `Test::NonDefinedFunction(int)'
collect2: ld returned 1 exit status

You forgot to define Test::NonDefinedFunction. Let's define it in a separate file.

File: try1.cpp

#include "try.h"

void Test::NonDefinedFunction(int val)
{
    i = val;
}

Let us compile it into object code:

$ g++ -c try1.cpp -o try1.o
$

Again it is successful. Let us try to link only this file:

$ g++ try1.o
/usr/lib/gcc/x86_64-redhat-linux/4.4.5/../../../../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
collect2: ld returned 1 exit status

No main, so it won't link!

Now you have two separate object files that together contain all the components you need. Just pass BOTH of them to the linker and let it do the rest:

$ g++ try.o try1.o
$

No error! This is because the linker finds definitions of all the functions (even though they are scattered across different object files) and fills the blanks in the object code with the appropriate values.

I believe this is your question:

Where I get confused is when compiler complained about DefinedIncorrectFunction. It didn't look for implementation of NonDefinedFunction but it went through the DefinedIncorrectFunction.

The compiler tried to parse DefinedIncorrectFunction (because you provided a definition in this source file) and there was a syntax error (missing semicolon). On the other hand, the compiler never saw a definition for NonDefinedFunction because there simply was no code for it in this module. You might have provided a definition of NonDefinedFunction in another source file, but the compiler doesn't know that. The compiler only looks at one source file (and its included header files) at a time.

Say you want to eat some soup, so you go to a restaurant.

You search the menu for soup. If you don't find it on the menu, you leave the restaurant (kind of like a compiler complaining that it couldn't find the function). If you find it, what do you do?

You call the waiter to go get you some soup. However, just because it's on the menu doesn't mean that they also have it in the kitchen. It could be an outdated menu, or it could be that someone forgot to tell the chef that he's supposed to make soup. So again, you leave (like an error from the linker that it couldn't find the symbol).

The compiler checks that the source code conforms to the language and adheres to its semantics. The output from the compiler is object code.

The linker links the different object modules together to form an executable. The definitions of functions are located in this phase, and the appropriate code to call them is added in this phase.

The compiler compiles code in the form of translation units. It will compile all the code that is included in a source .cpp file.
DefinedIncorrectFunction() is defined in your source file, so the compiler checks it for language validity.
NonDefinedFunction() does not have any definition in the source file, so the compiler does not need to compile it. If the definition is present in some other source file, the function will be compiled as part of that translation unit and the linker will then link to it. If the linker does not find the definition at the linking stage, it will raise a linking error.

What the compiler does, and what the linker does, depends on the implementation: a legal implementation could just store the tokenized source in the “compiler”, and do everything in the linker. Modern implementations do put off more and more to the linker, for better optimization. And many early implementations of templates didn't even look at the template code until link time, other than matching braces enough to know where the template ended. From a user point of view, you're more interested in whether the error “requires a diagnostic” (which can be emitted by the compiler or the linker) or is undefined behavior.

In the case of DefinedIncorrectFunction, you have provided source text which the implementation is required to parse. That text contains an error for which a diagnostic is required. In the case of NonDefinedFunction: if the function is used, failure to provide a definition (or providing more than one definition) in the complete program is a violation of the one definition rule, which is undefined behavior. No diagnostic is required (but I can't imagine an implementation that didn't provide one for a missing definition of a function that was used).

In practice, errors which can be easily detected simply by examining the text input of a single translation unit are defined by the standard to “require a diagnostic”, and will be detected by the compiler. Errors which cannot be detected by the examination of a single translation unit (e.g. a missing definition, which might be present in a different translation unit) are formally undefined behavior. In many cases, the errors can be detected by the linker, and in such cases, implementations will in fact emit an error.

This is somewhat modified in cases like inline functions, where you're allowed to repeat the definition in each translation unit, and extremely modified by templates, since many errors cannot be detected until instantiation. In the case of templates, the standard leaves implementations a great deal of freedom: at the least, the compiler must parse the template enough to determine where the template ends. The standard added things like typename, however, to allow much more parsing before instantiation. In dependent contexts, however, some errors cannot possibly be detected before instantiation, which may take place at compilation time or at link time. Early implementations favored link time instantiation; compile time instantiation dominates today, and is used by VC++ and g++.

The missing semicolon is a syntax error and therefore the code should not compile. This might happen even in a template implementation. Essentially, there is a parsing stage, and whilst it is obvious to a human how to "fix and recover", a compiler doesn't have to do that. It can't just "imagine the semicolon is there because that's what you meant" and continue.

A linker looks for function definitions to call where they are required. It isn't required here so there is no complaint. There is no error in this file as such, as even if it were required, it might not be implemented in this particular compilation unit. The linker is responsible for collecting together different compilation units, i.e. "linking" them.

Ah, but you could have NonDefinedFunction(int) in another compilation unit.

The compiler produces some output for the linker that basically says the following (among other things):

  • Which symbols (functions/variables/etc.) are defined.
  • Which symbols are referenced but undefined. In this case the linker needs to resolve the references by searching through the other modules being linked. If it can't, you get a linker error.

The linker is there to link in code defined (possibly) in external modules: libraries or object files you will use together with this particular source file to generate the complete executable. So, if you have a declaration but no definition, your code will compile, because the compiler knows the linker might find the missing code somewhere else and make it work. Therefore, in this case you will get an error from the linker, not the compiler.

If, on the other hand, there's a syntax error in your code, the compiler can't even compile it and you will get an error at that stage. Macros and templates may behave a bit differently, not causing errors if they are not used (templates are more or less macros with a somewhat nicer interface), but it also depends on the severity of the error. If you mess up so badly that the compiler can't figure out where the templated/macro code ends and regular code starts, it won't be able to compile.

With regular code, the compiler must compile even dead code (code not referenced in your source file), because someone might want to use that code from another source file by linking your .o file with their code. Therefore non-templated/non-macro code must be syntactically correct even if it is not directly used in the same source file.
