Why do I get faster binaries with Xcode's llvm than with clang++ from MacPorts?

Question

I have written a benchmark method to test my C++ program (which searches a game tree), and I am noticing that compiling with the "LLVM compiler 2.0" option in Xcode 4.0.2 gives me a significantly faster binary than compiling with the latest version of clang++ from MacPorts.

If I understand correctly, I am using a clang front end and an llvm back end in both cases. Has Apple made improvements to their clang/llvm distribution to produce faster binaries for Mac OS? I can't find much information about the project.

Here are the benchmarks my program produces for various compilers, all using -O3 optimization (higher is better):

(Xcode) "gcc 4.2": 38.7
(Xcode) "llvm gcc 4.2": 51.2
(Xcode) "llvm compiler 2.0": 50.6
g++-mp-4.6: 43.4
clang++: 40.6

Also, how do I compile from the terminal with the clang/llvm that Xcode is using? I can't find the command.

EDIT: The scores I output are "thousands of games per second", calculated over a long enough run of the program. Scores are very consistent over multiple runs, and recent major algorithmic improvements have given me 1% - 5% speedups, for example. A 25% speedup from 40 to 50 is huge for my program.

UPDATE: I wasn't invoking clang++ from the command line with -flto. Now when I compare clang++ -O3 -flto to /Developer/usr/bin/clang++ -O3 -flto from the command line, the results are closer, but the Apple one is still 6.5% faster.

Now, how do I enable link-time optimization for gcc? When I try g++ -flto I get the following error:

cc1plus: error: LTO support has not been enabled in this configuration

Answer 1

The Apple LLVM Compiler should be available under /Developer/usr/bin/clang.

I can't think of any particular reason why MacPorts clang++ would generate slower code... I would check whether you're passing in comparable command-line options. One thing that would make a large difference is if you're producing 32-bit code with one compiler and 64-bit code with the other.

Answer 2

If your GCC was built without LTO support, you need to build it yourself:

http://solarianprogrammer.com/2012/07/21/compiling-gcc-4-7-1-mac-osx-lion/

For LTO you need to add 'libelf' to the instructions.

http://sourceforge.net/apps/trac/mingw-w64/wiki/LTO%20and%20GCC

Answer 3

The exact speed of an algorithm can depend on all kinds of things that are totally outside your control and the compiler's. You may have a loop where the execution time depends on precisely how the instructions are aligned in memory, in a way that the compiler couldn't predict. I have seen cases where a loop could enter different "states" with different execution times per iteration (so after a context switch, it could enter a state where it took either 12 or 13 cycles, rather randomly). This can all be coincidence.

And you might be using different libraries, which is quite possibly the reason. In MacOS X, they are using a new and presumably faster implementation of std::string and std::vector, for example.

Answer 1
2 2011-06-16 03:48:17

Answer 2
0 2013-02-15 11:23:46

Answer 3
0 2014-03-14 14:23:32