
Is reducing number of cpp translation units a good idea?

Original post: 2009-05-14 05:35:38 3 8 c++/ performance/ build-process/ module/ header-files

I find that when there are a lot of classes, the compilation time increases dramatically if I use one *.h and one *.cpp file per class. I already use precompiled headers and incremental linking, but the compile time is still very long (yes, I use boost ;)

So I came up with the following trick:

defined *.cpp files as non-compilable
defined *.cxx files as compilable
added one *.cxx file per application module, and #included all the *.cpp files of this module in it.
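
A module's *.cxx file in this scheme contains nothing but #include directives. As a rough sketch of what such a file looks like, the hypothetical helper below (make_unity_source is not from the original post, and the file names are made up) builds that text from a list of *.cpp names:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: builds the text of one module's unity (.cxx) file,
// i.e. one #include directive per .cpp file of that module.
std::string make_unity_source(const std::vector<std::string>& cpp_files) {
    std::string text;
    for (const std::string& file : cpp_files) {
        assert(!file.empty());  // assumption of this sketch: no empty names
        text += "#include \"" + file + "\"\n";
    }
    return text;
}
```

For a made-up module consisting of widget.cpp and window.cpp, the result is the two lines `#include "widget.cpp"` and `#include "window.cpp"`. Saved as, say, gui.cxx, that would be the only file of the module handed to the compiler.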

So instead of 100+ translation units I ended up with only 8. The compile time became 4-5 times shorter.

The downsides are that you have to manually include all the *.cpp files (but it's not really a maintenance nightmare, since if you forget to include something the linker will remind you), and that some VS IDE conveniences do not work with this scheme, e.g. Go To/Move to Implementation etc.

So the question is: is having lots of cpp translation units really the only true way? Is my trick a known pattern, or am I missing something? Thanks!

8 answers

One significant drawback of this approach is caused by having one .obj file for each translation unit.

If you create a static library for reuse in other projects, you will often face bigger binaries in those projects if you have several huge translation units instead of many small ones, because the linker will only include the .obj files containing the functions/variables actually referenced from within the project using the library.

With big translation units it's more likely that each unit is referenced and the corresponding .obj file is included. Bigger binaries may be a problem in some cases. It's also possible that some linkers are smart enough to include only the necessary functions/variables, not the whole .obj files.

Also, if the .obj file is included and all its global variables are included too, then their constructors/destructors will be called when the program is started/stopped, which surely will take time.
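
To illustrate the startup cost, here is a minimal single-file sketch. Registry, g_unused_registry and the counter function are hypothetical names, and the example assumes the global lives in a translation unit that ends up in the program:

```cpp
#include <cassert>

// Counts how many constructors of globals have run so far.
// A function-local static avoids any initialization-order problems.
int& startup_constructor_count() {
    static int count = 0;
    return count;
}

// Hypothetical class standing in for any global with a non-trivial constructor.
struct Registry {
    Registry() { ++startup_constructor_count(); }
};

// Never referenced by name anywhere, yet still constructed at startup
// because its translation unit is part of the program.
Registry g_unused_registry;

// Call this from main(): the constructor has already run by then.
void expect_global_already_constructed() {
    assert(startup_constructor_count() == 1);
}
```

The more unrelated globals get pulled in along with one big .obj file, the more of this work runs at startup and shutdown.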

I've seen what you describe done in video games, since it helps the compiler do optimizations it otherwise couldn't, as well as saving a lot of memory. I've seen "uber build" and "bulk build" used to refer to this idea. And if it helps speed up your build, why not?

Bundling a larger number of C++ source code files into a single file is an approach that has been mentioned a few times recently, especially when people were building large systems and pulling in complicated header files (that'll be boost, then).

As you mention VS, I found that the number of include files in a project, and especially the size of the include path, seems to affect Visual C++'s compilation times far more than it affects g++'s. This is especially the case with lots of nested includes (again, boost does that), as a large number of file searches are necessary to find all the include files required by the source code. Combining the code into a single source file means that the compiler can be much smarter about finding said includes, plus there are obviously fewer of them to be found, as you would expect the files in the same subproject to include a very similar set of header files.

The "lots of compilation units" approach to C++ development usually comes from a desire to decouple classes and minimise dependencies between classes, so the compiler only has to rebuild the minimal set of files when you make changes. This is generally a good approach, but often not really feasible within a subproject, simply because the files in there depend on each other, so you'll end up with quite large rebuilds anyway.

I don't think that reducing the number of compilation units is a good idea. You are trying to solve the problem of long compilation times, and this approach seems to help with it, but here is what you get in addition:

1. Increased compilation time during development. Usually a developer modifies a few files at a time, and compilation will probably be faster for 3-4 small files than for one very big file.
2. As you mentioned, the code is harder to navigate; IMHO this is extremely important.
3. You can have some interference between the .cpp files included into one .cxx file:
a. It is common practice to locally define a macro new in a cpp file (for debug builds) for memory leak checking. Unfortunately this cannot be done before including headers that use placement new (as some STL and BOOST headers do).
b. It is common practice to add using declarations in cpp files. With your approach this may lead to problems with headers included later.
c. Name conflicts are more probable.
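
A small sketch of the kind of interference meant in point b, using a using-directive as a stand-in. The two "files" are simulated inside one translation unit, and all names (text, describe, call_from_b) are made up for illustration:

```cpp
#include <cassert>
#include <string>

// Stands in for a library header that a.cpp uses.
namespace text {
std::string describe(int) { return "text::describe(int)"; }
}

// ---- contents of a.cpp (hypothetical) ----
using namespace text;  // harmless while a.cpp is its own translation unit

// ---- contents of b.cpp (hypothetical), included after a.cpp ----
std::string describe(long) { return "::describe(long)"; }

std::string call_from_b() {
    // Compiled on its own, b.cpp would find only ::describe(long) here.
    // In the combined file the using-directive from a.cpp is still in
    // effect, so the exact match text::describe(int) wins instead.
    return describe(42);
}

// Documents the silently changed behavior of the combined file.
void expect_leaked_using_directive() {
    assert(call_from_b() == "text::describe(int)");
}
```

Nothing here fails to compile, which is what makes this kind of problem easy to miss: the combined build quietly calls a different function than the per-file build would.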

IMHO, a much cleaner (but maybe more expensive) way to speed up compilation is to use a distributed build system. These are especially effective for clean builds.

I'm not sure if this is relevant in your case, but maybe you can use declarations instead of definitions to reduce the number of #includes that you need. Also, maybe you can use the pimpl idiom for the same purpose. That would hopefully reduce the number of source files that need to be recompiled each time and the number of headers that have to be pulled in.
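
A minimal pimpl sketch, shown in a single file for brevity. Widget, Impl and the file split marked in the comments are hypothetical; in a real project the two parts would live in widget.h and widget.cpp:

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- widget.h (hypothetical): only a forward declaration of Impl ----
class Widget {
public:
    explicit Widget(int id);
    ~Widget();  // defined in the .cpp part, where Impl is a complete type
    std::string label() const;

private:
    struct Impl;                  // declaration instead of definition
    std::unique_ptr<Impl> impl_;  // clients never see Impl's members
};

// ---- widget.cpp (hypothetical): heavy #includes would go here ----
struct Widget::Impl {
    int id;
    std::string prefix;
};

Widget::Widget(int id) : impl_(new Impl{id, "widget-"}) {}
Widget::~Widget() = default;

std::string Widget::label() const {
    return impl_->prefix + std::to_string(impl_->id);
}

// Small usage check.
void expect_widget_label() {
    assert(Widget(7).label() == "widget-7");
}
```

Changing Impl's members then only requires recompiling widget.cpp, and any headers needed only by Impl stay out of widget.h.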

Bigger and fewer translation units do not take advantage of parallel compilation. I don't know what compilers and platforms you are using, but compiling multiple translation units in parallel might significantly decrease the build time...

This concept is called a unity build.

Following on from sharptooth's post, I'd tend to examine the resultant executables in some detail. If they are different, I'd tend to limit your technique to debug builds and resort to the original project config for the main release build. When checking the executable, I'd also look at its memory footprint and resource usage at startup and while running.