
How are GCC and g++ bootstrapped?

This has been bugging me for a while. How do GCC and g++ compile themselves?

I'm guessing that every revision gets compiled with a previously built revision. Is this true? And if it is, does it mean that the oldest g++ and GCC versions were written in assembly?

The oldest version of GCC was compiled using another C compiler, since other C compilers already existed when it was written. The very first C compiler ever (ca. 1973, IIRC) was implemented either in PDP-11 assembly or in the B programming language which preceded it, but in any case the B compiler was written in assembly. Similarly, the first ever C++ compiler (CPre/Cfront, 1979-1983) was probably first implemented in C, then rewritten in C++.

When you compile GCC or any other self-hosting compiler, the full order of building is:

  1. Build the new version of GCC with an existing C compiler.
  2. Rebuild the new version of GCC with the one you just built.
  3. (Optional) Repeat step 2 for verification purposes.

This process is called bootstrapping. It tests the compiler's capability of compiling itself and makes sure that the resulting compiler is built with all the optimizations that it itself implements.
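To make the three steps concrete, here is a toy model in Python. It is only a sketch: the `Binary` class, `compile_source`, `bootstrap`, and the names "old-cc", "vendor" and "new-gcc" are all made up for illustration and have nothing to do with GCC's real build system. It assumes that a compiled binary is fully described by the source it was built from and by the compiler that generated its machine code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    """A compiler executable in this toy model (hypothetical, not a GCC concept)."""
    implements: str  # which compiler source this binary was built from
    codegen_of: str  # which compiler generated this binary's machine code


def compile_source(compiler: Binary, source: str) -> Binary:
    """Run `compiler` on `source` and return the resulting executable.

    The output behaves like `source`, but its machine code is shaped by
    whatever the running compiler implements.
    """
    return Binary(implements=source, codegen_of=compiler.implements)


def bootstrap(existing: Binary, new_source: str):
    stage1 = compile_source(existing, new_source)  # step 1: built by the old compiler
    stage2 = compile_source(stage1, new_source)    # step 2: built by the new compiler
    stage3 = compile_source(stage2, new_source)    # step 3: same again, for verification
    return stage1, stage2, stage3


system_cc = Binary(implements="old-cc", codegen_of="vendor")
stage1, stage2, stage3 = bootstrap(system_cc, "new-gcc")

print(stage1 != stage2)  # True: stage 1 still carries the old compiler's code generation
print(stage2 == stage3)  # True: the result no longer depends on the starting compiler
```

In this model, stage 2 is the first binary that is both the new compiler and compiled by the new compiler, which is why step 2 is needed. Step 3 is a consistency check: if stage 2 and stage 3 differ, the new compiler miscompiled itself somewhere.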

EDIT: Drew Dormann, in the comments, points to Bjarne Stroustrup's account of the earliest implementation of C++. It was implemented in C++ but translated by what Stroustrup calls a "preprocessor" from C++ to C; not a full compiler by his definition, but still, C++ was bootstrapped in C.

If you want to replicate the bootstrap process of GCC in a modern environment (x86 Linux), you can use the tools developed by the bootstrappable project:

  • We can start with the hex0 assembler (on x86 it's a 357-byte binary), which does roughly what the following two commands do:

    sed 's/[;#].*$//g' hex0_x86.hex0 | xxd -r -p > hex0
    chmod +x hex0

    I.e. it translates the ASCII (hex) representation of a binary program into binary code, but it is itself written in hex0.

    Basically, hex0 has equivalent source code that is in one-to-one correspondence with its binary code.

  • hex0 can be used to build a slightly more powerful hex1 assembler that supports a few more features (single-character labels and offset calculation). hex1 is written in hex0 assembly.

  • hex1 can be used to build hex2 (an even more advanced assembler that supports multi-character labels).

  • hex2 can then be used to build a macro assembler (where programs use macros instead of hex opcodes).

  • You can then use this macro assembler to build cc_x86, which is a "C compiler" written in assembly. cc_x86 only supports a small subset of C, but that's an impressive start.

  • You can use cc_x86 to build M2-Planet (Macro Platform Neutral Transpiler), which is a C compiler written in C. M2-Planet is self-hosting and can build itself.

  • You can then use M2-Planet to build GNU Mes, which is a small Scheme interpreter.

  • mes can be used to run mescc, which is a C compiler written in Scheme that lives in the same repository as mes.

  • mescc can be used to rebuild mes and also to build the mes C library.

  • Then mescc can be used to build a slightly patched Tiny C Compiler (TCC).

  • Then you can use it to build a newer version of TCC, 0.9.27.

  • GCC 4.0.4 and the musl C library can be built with TCC 0.9.27.

  • Then you can build a newer GCC using the older GCC, e.g. GCC 4.0.4 -> GCC 4.7.4 -> modern GCC.
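As a rough illustration of the very first step in the list, the hex0 translation (strip comments, then turn hex digits into bytes) can be sketched in a few lines of Python. This mirrors the sed/xxd pipeline shown earlier, not the real hex0 binary. The function name `hex0_assemble` and the sample input are made up for this example, and skipping non-hex characters and rejecting an odd digit count are assumptions of the sketch.

```python
import re

HEX_DIGITS = set("0123456789abcdefABCDEF")


def hex0_assemble(source: str) -> bytes:
    """Strip ';' and '#' comments, then turn the remaining hex digits into bytes."""
    digits = []
    for line in source.splitlines():
        code = re.sub(r"[;#].*$", "", line)  # same pattern as the sed command
        # Assumption of this sketch: anything that is not a hex digit is skipped.
        digits.extend(ch for ch in code if ch in HEX_DIGITS)
    if len(digits) % 2:
        # Assumption of this sketch: a dangling half-byte is treated as an error.
        raise ValueError("odd number of hex digits")
    return bytes.fromhex("".join(digits))


# Made-up sample input: the first seven bytes of a 32-bit little-endian ELF header.
example = """
7F 45 4C 46   ; ELF magic number
01 01 01      # 32-bit, little-endian, ELF version 1
"""

print(hex0_assemble(example).hex())  # 7f454c46010101
```

The point of the exercise is that nothing here needs a compiler or even a real assembler: each pair of hex digits in the source is exactly one byte in the output, so the initial binary can be checked by hand against its commented source.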

TL;DR:

hex0 -> hex1 -> hex2 -> M0 -> M2-Planet -> Mes -> Mescc -> TCC -> GCC.
