
Assembly code for __rdtsc() in -O0 vs -O3

I have the following code:

#include <x86intrin.h>

int main() {
    return __rdtsc();
}

I compiled it on my machine (Intel i7-6700 CPU) and disassembled the result with objdump:

g++ -Wall test_tsc.cpp -o test_tsc -march=native -mtune=native -O0 -std=c++20
objdump -M intel -d test_tsc > test_tsc.O0

Then, in test_tsc.O0:

0000000000401122 <main>:
  401122:   55                      push   rbp
  401123:   48 89 e5                mov    rbp,rsp
  401126:   0f 31                   rdtsc  
  401128:   48 c1 e2 20             shl    rdx,0x20
  40112c:   48 09 d0                or     rax,rdx
  40112f:   90                      nop
  401130:   5d                      pop    rbp
  401131:   c3                      ret    
  401132:   66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
  401139:   00 00 00 
  40113c:   0f 1f 40 00             nop    DWORD PTR [rax+0x0]

What do push rbp and mov rbp,rsp do? They look like they are saving the stack pointer, but there isn't actually a function call. If g++ treated __rdtsc() as a function call, wouldn't there be a call instruction afterward?

Thanks.

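rbp is the base (frame) pointer, not the stack pointer. push rbp and mov rbp,rsp form the usual prologue that sets up a stack frame for main itself; they are unrelated to __rdtsc(), which GCC expands inline into the rdtsc instruction, so no call is emitted. The base pointer is used for backtraces during debugging, but it is not needed for the program to actually run.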

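rbp has to be preserved across function calls, which is why main saves it with push rbp and restores it with pop rbp at -O0. When optimizing, GCC omits the frame pointer by default (-fomit-frame-pointer), so with -O3 only the expected assembly is generated (shown here in AT&T syntax):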

main:
        rdtsc
        salq    $32, %rdx
        orq     %rdx, %rax
        ret
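
The shift-and-or pair that appears in both listings (shl/or at -O0, salq/orq at -O3) merges the two halves that rdtsc produces: the low 32 bits of the counter arrive in EAX and the high 32 bits in EDX. A minimal sketch of the same arithmetic in portable C++ follows; the helper names combine_tsc and low_half are made up for illustration and are not part of x86intrin.h.

```cpp
#include <cstdint>

// Hypothetical helper: mirrors `shl rdx,0x20` followed by `or rax,rdx`.
// `lo` stands for the value rdtsc leaves in EAX, `hi` for the value in EDX.
constexpr std::uint64_t combine_tsc(std::uint32_t lo, std::uint32_t hi) {
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}

// Hypothetical helper: the 32 bits that remain when the 64-bit counter
// is narrowed to a 32-bit type, as happens with `return __rdtsc();`
// in a function that returns int.
constexpr std::uint32_t low_half(std::uint64_t tsc) {
    return static_cast<std::uint32_t>(tsc);
}

// Compile-time check of the bit layout.
static_assert(combine_tsc(0x89ABCDEFu, 0x01234567u) == 0x0123456789ABCDEFull,
              "high half goes in bits 63..32, low half in bits 31..0");
```

Because main returns int, the program in the question discards the upper half again; storing the result of __rdtsc() in an unsigned 64-bit variable keeps the whole counter.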

