
What does linking in the compilation process actually do?

As I understand it, the GCC compiler performs four steps when I compile a C program.

  1. Preprocessing - C code (*.c) with macros to C code without macros (*.i)
  2. Compiling - C code (*.i) to Assembly language (*.s)
  3. Assembling - Assembly language (*.s) to Object code (*.o)
  4. Linking - Object code (*.o) to executable (*)

The first three steps make perfect sense to me, but I am still confused as to what linking actually does.

After step three, why can't I run the *.o file? At that point my C code is object/machine/byte code and can be interpreted by the CPU directly. Yet when I make my *.o file executable and try to run it, I get this error:

bash: ./helloworld.o: cannot execute binary file: Exec format error

Why do I get this error? If I have a tiny C program (for example a hello world program) with only one C file, it would appear to me that linking has no purpose because there's nothing to link. So what does linking in the compilation process actually do?

Thanks in advance for any replies.

If I have a tiny C program (for example a hello world program)

Even your helloworld program uses #include <stdio.h>, doesn't it? That means you're using a library, and the linking step is there to combine the necessary object code (here, the library code) with yours to create a binary.


For a detailed description of what the linking step does (and how it compares with compiling), see this question.

Roughly speaking, linking does the following:

  • Find all the matching segments from each object file and concatenate them. This way we end up with one large .text (code), one .data, one .bss, etc.
  • Resolve all symbols that are used. Many symbols are local, so they can be resolved immediately. Unresolved symbols are searched for in the libraries you asked to link with. When this is done, the result is a symbol table / link map.
  • Make a file that is actually executable. On Linux, it just happens that executables, libraries, and object files are all in the ELF format. This is not true for all platforms.

The simple answer is that .o files and executables serve different purposes and have different formats.

If you want the complete answer, you will need to read the documentation for your platform's binary format.

On Linux this will be here. This document describes the difference between the intermediate format and the final executable format.

Just as an aside, the Linux kernel module loader does use .o (or rather .ko) files directly.
