
Why is a dynamically linked executable noticeably slower than the statically linked one in Linux?

Testing with a toy program that determines the result of a tic-tac-toe board, I got the timings below. What is causing such a large difference? I suspect that calls to rand are faster with a statically linked libc, but I'm still surprised by the size of the gap.

~$ gcc a.c -std=c11 -O3
~$ time ./a.out
32614644

real    0m9.396s
user    0m9.388s
sys     0m0.004s

~$ gcc a.c -std=c11 -O3 -static
~$ time ./a.out
32614644

real    0m6.891s
user    0m6.884s
sys     0m0.000s

#include <stdio.h>
#include <stdlib.h>

#define SIZE 3
#define SIZE_2 (SIZE * SIZE)

static int determineResult(int board[static SIZE_2]) {
  for (int i = 0; i < SIZE_2; i += SIZE) {
    if (!board[i]) {
      continue;
    }
    for (int j = i + 1; j < i + SIZE; ++j) {
      if (board[i] != board[j]) {
        goto next;
      }
    }
    return board[i];
  next:;
  }
  for (int i = 0; i < SIZE; ++i) {
    if (!board[i]) {
      continue;
    }
    for (int j = i + SIZE; j < i + SIZE_2; j += SIZE) {
      if (board[i] != board[j]) {
        goto next2;
      }
    }
    return board[i];
  next2:;
  }
  for (int i = SIZE + 1; i < SIZE_2; i += SIZE + 1) {
    if (board[i] != *board) {
      goto next3;
    }
  }
  return *board;
next3:
  for (int i = SIZE * 2 - 2; i <= SIZE_2 - SIZE; i += SIZE - 1) {
    if (board[i] != board[SIZE - 1]) {
      return 0;
    }
  }
  return board[SIZE - 1];
}

#define N 50000000

int main(void) {
  srand(0);
  size_t n = 0;
  for (int i = 0; i < N; ++i) {
    int board[SIZE_2];
    for (int i = 0; i < SIZE_2; ++i) {
      board[i] = rand() % 3;
    }
    n += determineResult(board);
  }
  printf("%zu\n", n);
  return EXIT_SUCCESS;
}

I can't be sure this is the reason without knowing the particular ABI your system is using (which depends on the OS and CPU architecture), but the following is the most likely explanation.

On most implementations, code in shared libraries (including the shared libc.so) has to be position-independent code (PIC). This means it can be loaded at any address (rather than being assigned a fixed run-time address by the linker) and thus cannot use hard-coded absolute data addresses in the machine code. Instead, it has to access global data via either instruction-pointer-relative addressing or a global offset table (GOT), whose address is either kept in a register or computed relative to the instruction pointer. These addressing modes are efficient mainly on well-designed modern instruction set architectures like x86_64, AArch64, RISC-V, etc. On most other architectures, including 32-bit x86, they're quite inefficient. For example, the following function:

int x;
int get_x()
{
    return x;
}

will balloon into something like the following when compiled as position-independent code on 32-bit x86:

get_x:
    push %ebp
    mov %esp, %ebp
    push %ebx
    sub $4, %esp
    call __x86.get_pc_thunk.bx
    add $_GLOBAL_OFFSET_TABLE_, %ebx
    mov x@GOT(%ebx), %eax
    mov (%eax),%eax
    add $4, %esp
    pop %ebx
    pop %ebp
    ret

whereas for non-position-independent code you would expect to see:

get_x:
    mov x, %eax
    ret

Since random number generators have internal (global) state, they're stuck doing this expensive dance in position-independent code. And since the actual computation they do is likely very short and fast, the PIC overhead is probably a significant part of their run time.

One way to confirm this theory would be to try using rand_r or random_r instead. These functions use caller-provided state and thus can (at least in theory) avoid any internal access to global data.
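The experiment could be sketched roughly as follows, assuming a POSIX system where rand_r is available. The helper name fill_board_r is hypothetical; it would replace the inner loop of main in the question, with the seed kept in a local variable instead of calling srand.

```c
#define _POSIX_C_SOURCE 200809L /* exposes rand_r() under -std=c11 */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SIZE 3
#define SIZE_2 (SIZE * SIZE)

/* Hypothetical helper: fills a board using caller-owned PRNG state,
   so the generator does not need libc's internal global state. */
static void fill_board_r(int board[static SIZE_2], unsigned int *seed) {
  assert(seed != NULL);
  for (int i = 0; i < SIZE_2; ++i) {
    board[i] = rand_r(seed) % 3;
  }
}
```

In main this would look like `unsigned int seed = 0;` before the outer loop and `fill_board_r(board, &seed);` inside it. rand_r is not guaranteed to produce the same sequence as rand, so the printed sum will likely differ from the one in the question; only the timings of the dynamic and static builds are meant to be compared.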

The problem here is that you are comparing total execution time. In a small example like this, the statically linked build will come out well ahead because there are no lookups to be done.

The big difference between static and dynamic linking is that a dynamically linked program consists of several modules/objects that are linked together at run time, whereas a statically linked binary contains everything it needs. Some specifics can vary, of course, but that's roughly it.

So, taking the above into consideration: starting an executable, loading a few different files, executing your function and returning is probably going to take more time than loading a single file and executing your function.
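One way to check how much of the gap comes from startup and loading rather than from the loop itself is to time only the hot loop from inside the program. Below is a minimal sketch, assuming a POSIX system with clock_gettime. The helper name time_rand_calls is hypothetical and measures nothing but the calls to rand.

```c
#define _POSIX_C_SOURCE 200809L /* exposes clock_gettime() under -std=c11 */
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: returns the seconds spent in `calls` calls to rand(),
   excluding process startup and library loading. The running sum is
   written to *sink so the caller can use or print it. */
static double time_rand_calls(long calls, unsigned long *sink) {
  struct timespec start, end;
  unsigned long sum = 0;
  assert(calls >= 0 && sink != NULL);
  clock_gettime(CLOCK_MONOTONIC, &start);
  for (long i = 0; i < calls; ++i) {
    sum += (unsigned long)rand();
  }
  clock_gettime(CLOCK_MONOTONIC, &end);
  *sink = sum;
  return (double)(end.tv_sec - start.tv_sec) +
         (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

If the value returned for the same number of calls differs between the dynamic and static builds by about as much as the `time` output does, the difference lies in the loop and not in program startup.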
