gcc -O3 optimization: xmm0 register?

I was writing a vsprintf function to use in my 64-bit OS kernel (written in C), and checked that it works well in Visual Studio and Cygwin gcc. Then I put it into my kernel and ran it... but the kernel doesn't work properly.

I debugged and figured out the problem: vsprintf contains the following assembly instruction:

movdqa xmm0,XMMWORD PTR [rip+0x0]

The real problem is that I NEVER use floating point!

I guess this comes from gcc's optimization, and that guess seems to be right, because the code works well without optimization.

Is there any solution, for example a gcc option that disables optimizations that use XMM registers?

The XMM register move instructions are generated because, in the System V AMD64 ABI, floating-point arguments are passed in XMM0–XMM7.

Since the compiler cannot tell just by looking at the variadic function whether floating-point arguments will be passed, it also needs to generate instructions that save those registers so the values can be reached through the va_list.
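To make this concrete, here is a minimal sketch of a variadic function whose argument types are only known at run time. The name `sum_args` and its one-letter type codes are made up for illustration. Whether a `double` is ever read depends on the string the caller passes, so the compiler has to be prepared for it when it compiles the function.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: sums its variadic arguments.
 * `types` holds one character per argument: 'i' = int, 'd' = double. */
double sum_args(const char *types, ...) {
    va_list ap;
    double total = 0.0;

    va_start(ap, types);
    for (const char *t = types; *t != '\0'; ++t) {
        if (*t == 'i') {
            total += va_arg(ap, int);      /* read as an integer argument */
        } else {
            assert(*t == 'd');             /* only 'i' and 'd' are supported */
            total += va_arg(ap, double);   /* read as a floating-point argument */
        }
    }
    va_end(ap);
    return total;
}
```

On System V AMD64, a call such as `sum_args("id", 2, 0.5)` passes `2` in a general-purpose register and `0.5` in XMM0, so the function can only find the second value if its prologue saved XMM0.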


You could use the -mno-sse flag to disable SSE. For example,

#include <stdarg.h>
#include <stdio.h>

__attribute__((noinline))
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}

Without the -mno-sse flag:

subq    $0x000000d8,%rsp
testb   %al,%al
movq    %rsi,0x28(%rsp)
movq    %rdx,0x30(%rsp)
movq    %rcx,0x38(%rsp)
movq    %r8,0x40(%rsp)
movq    %r9,0x48(%rsp)
je  0x100000f1b
movaps  %xmm0,0x50(%rsp)
movaps  %xmm1,0x60(%rsp)
movaps  %xmm2,0x70(%rsp)
movaps  %xmm3,0x00000080(%rsp)
movaps  %xmm4,0x00000090(%rsp)
movaps  %xmm5,0x000000a0(%rsp)
movaps  %xmm6,0x000000b0(%rsp)
movaps  %xmm7,0x000000c0(%rsp)
0x100000f1b:
leaq    0x000000e0(%rsp),%rax
movl    $0x00000008,0x08(%rsp)
movq    %rax,0x10(%rsp)
leaq    0x08(%rsp),%rsi
leaq    0x20(%rsp),%rax
movl    $0x00000030,0x0c(%rsp)
movq    %rax,0x18(%rsp)
callq   0x100000f6a ; symbol stub for: _vprintf
addq    $0x000000d8,%rsp
ret
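The stores to 0x08(%rsp) through 0x18(%rsp) in the listing above fill in the va_list. As a rough sketch, the structure below mirrors the va_list element described by the System V AMD64 ABI; the struct name `sysv_va_list_tag` and the helper `initial_va_state` are made up for illustration, while the field names follow the ABI document. The helper assumes the ABI's layout of six 8-byte general-purpose slots followed by 16-byte XMM slots in the register save area.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative mirror of the System V AMD64 va_list element. */
struct sysv_va_list_tag {
    unsigned int gp_offset;      /* offset of the next unread GP register slot */
    unsigned int fp_offset;      /* offset of the next unread XMM register slot */
    void *overflow_arg_area;     /* next argument that was passed on the stack */
    void *reg_save_area;         /* start of the saved-register block */
};

/* Hypothetical helper: the state va_start would set up for a function with
 * `named_gp` named integer/pointer parameters and `named_fp` named
 * floating-point parameters. */
struct sysv_va_list_tag initial_va_state(unsigned int named_gp,
                                         unsigned int named_fp,
                                         void *stack_args,
                                         void *save_area) {
    struct sysv_va_list_tag s;

    assert(named_gp <= 6 && named_fp <= 8);
    s.gp_offset = 8u * named_gp;              /* skip GP slots of named params */
    s.fp_offset = 6u * 8u + 16u * named_fp;   /* XMM slots follow the 6 GP slots */
    s.overflow_arg_area = stack_args;
    s.reg_save_area = save_area;
    return s;
}
```

For `f(const char* x, ...)` there is one named pointer parameter and no named floating-point parameter, which gives `gp_offset = 8` and `fp_offset = 0x30`, the two constants stored by the `movl` instructions in the listing. The `testb %al,%al` / `je` pair skips the `movaps` block when the caller reports in %al that no vector registers carry arguments.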

With the -mno-sse flag:

subq    $0x58,%rsp
leaq    0x60(%rsp),%rax
movq    %rsi,0x28(%rsp)
movq    %rax,0x10(%rsp)
leaq    0x08(%rsp),%rsi
leaq    0x20(%rsp),%rax
movq    %rdx,0x30(%rsp)
movq    %rcx,0x38(%rsp)
movq    %r8,0x40(%rsp)
movq    %r9,0x48(%rsp)
movl    $0x00000008,0x08(%rsp)
movq    %rax,0x18(%rsp)
callq   0x100000f6a ; symbol stub for: _vprintf
addq    $0x58,%rsp
ret

You could also use the target attribute to disable SSE just for that function, e.g.

__attribute__((noinline, target("no-sse")))
//                       ^^^^^^^^^^^^^^^^
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}

But be warned that other functions compiled with SSE support won't know that f doesn't use SSE, so calling it with floating-point numbers will cause undefined behavior:

int main() {
    f("%g %g", 1.0, 2.0);  // 1.0 and 2.0 are stored in XMM0–1
                           // So this will print garbage e.g. `0 6.95326e-310`
}
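For the kernel case in the question, where floating point is never formatted, the pitfall above does not arise as long as no caller passes floating-point arguments. The following is a minimal sketch of such an integer-only formatter, not the asker's vsprintf: the names `mini_vformat` and `mini_format` are made up, and only `%d`, `%s` and `%%` are handled. It never calls `va_arg` with `double`, so nothing is lost when the XMM save code is dropped with -mno-sse.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Append one character if there is room, always leaving space for '\0'. */
static void put_char(char *out, size_t cap, size_t *len, char c) {
    if (*len + 1 < cap) {
        out[*len] = c;
        ++*len;
    }
}

/* Hypothetical integer-only formatter: supports %d, %s and %%.
 * Returns the number of characters written, excluding the terminator. */
int mini_vformat(char *out, size_t cap, const char *fmt, va_list ap) {
    size_t len = 0;

    assert(out != NULL && cap > 0);
    for (; *fmt != '\0'; ++fmt) {
        if (*fmt != '%') {
            put_char(out, cap, &len, *fmt);
            continue;
        }
        ++fmt;
        if (*fmt == '\0') {
            break;                            /* stray '%' at the end */
        }
        if (*fmt == 'd') {
            int v = va_arg(ap, int);
            /* Negate in unsigned arithmetic so INT_MIN is handled too. */
            unsigned int u = (v < 0) ? 0u - (unsigned int)v : (unsigned int)v;
            char digits[16];
            int n = 0;

            do {                              /* collect digits in reverse */
                digits[n++] = (char)('0' + u % 10u);
                u /= 10u;
            } while (u != 0u);
            if (v < 0) {
                put_char(out, cap, &len, '-');
            }
            while (n > 0) {
                put_char(out, cap, &len, digits[--n]);
            }
        } else if (*fmt == 's') {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0') {
                put_char(out, cap, &len, *s++);
            }
        } else {
            put_char(out, cap, &len, *fmt);   /* "%%" and unknown specifiers */
        }
    }
    out[len] = '\0';
    return (int)len;
}

/* Variadic front end that forwards to mini_vformat. */
int mini_format(char *out, size_t cap, const char *fmt, ...) {
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = mini_vformat(out, cap, fmt, ap);
    va_end(ap);
    return n;
}
```

Output that does not fit in the buffer is silently truncated here; a real kernel formatter would likely want more specifiers and a way to report truncation.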

Using -O2 instead of -O3 will make it work.
