gcc -O3 optimize :: xmm0 register?
I was writing a vsprintf function for my 64-bit OS kernel (written in C), and checked that it works well in Visual Studio and Cygwin gcc. Then I put it into my kernel and ran it... but the kernel doesn't work properly.
I debugged and figured out the problem: vsprintf contains the following assembly instruction:
movdqa xmm0,XMMWORD PTR [rip+0x0]
The real problem is that I NEVER use floating point!
I guess this comes from gcc's optimization, and that seems to be right, because it works well without optimization.
Is there any solution, for example a gcc option that disables optimization with xmm registers?
The XMM register move instructions are generated because, in the System V AMD64 ABI, floating-point arguments are passed in XMM0–XMM7. Since the compiler cannot tell whether floating-point values are passed just by looking at the variadic function, it also needs to generate instructions that save those registers so the values can be read through the va_list.
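To see this effect in isolation, here is a minimal sketch of a variadic function that only ever reads integers. The name `sum_ints` is hypothetical and is not from the question. Compiling it with something like `gcc -O2 -S` on an x86-64 System V target should show the XMM spills in its prologue, although the exact output depends on the compiler version.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up `count` int arguments.
 * Nothing here touches floating point, yet on x86-64 System V the
 * compiler may still emit a prologue that spills xmm0-xmm7, because the
 * callee cannot know what the caller placed in the variadic slots. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);              /* initialise the argument cursor */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);     /* reads only integer arguments */
    va_end(ap);

    return total;
}
```

For example, `sum_ints(3, 1, 2, 3)` returns 6. The point is not the arithmetic but that the function needs no floating point, and the register save code is still part of what the compiler generates for it.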
You could use the -mno-sse flag to disable SSE. For example:
#include <stdarg.h>
#include <stdio.h>

__attribute__((noinline))
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}
Without the -mno-sse flag:
subq $0x000000d8,%rsp
testb %al,%al
movq %rsi,0x28(%rsp)
movq %rdx,0x30(%rsp)
movq %rcx,0x38(%rsp)
movq %r8,0x40(%rsp)
movq %r9,0x48(%rsp)
je 0x100000f1b
movaps %xmm0,0x50(%rsp)
movaps %xmm1,0x60(%rsp)
movaps %xmm2,0x70(%rsp)
movaps %xmm3,0x00000080(%rsp)
movaps %xmm4,0x00000090(%rsp)
movaps %xmm5,0x000000a0(%rsp)
movaps %xmm6,0x000000b0(%rsp)
movaps %xmm7,0x000000c0(%rsp)
0x100000f1b:
leaq 0x000000e0(%rsp),%rax
movl $0x00000008,0x08(%rsp)
movq %rax,0x10(%rsp)
leaq 0x08(%rsp),%rsi
leaq 0x20(%rsp),%rax
movl $0x00000030,0x0c(%rsp)
movq %rax,0x18(%rsp)
callq 0x100000f6a ; symbol stub for: _vprintf
addq $0x000000d8,%rsp
ret
With the -mno-sse flag:
subq $0x58,%rsp
leaq 0x60(%rsp),%rax
movq %rsi,0x28(%rsp)
movq %rax,0x10(%rsp)
leaq 0x08(%rsp),%rsi
leaq 0x20(%rsp),%rax
movq %rdx,0x30(%rsp)
movq %rcx,0x38(%rsp)
movq %r8,0x40(%rsp)
movq %r9,0x48(%rsp)
movl $0x00000008,0x08(%rsp)
movq %rax,0x18(%rsp)
callq 0x100000f6a ; symbol stub for: _vprintf
addq $0x58,%rsp
ret
You could also use the target attribute to disable SSE just for that function, e.g.:
__attribute__((noinline, target("no-sse")))
//                       ^^^^^^^^^^^^^^^^
void f(const char* x, ...) {
    va_list va;
    va_start(va, x);
    vprintf(x, va);
    va_end(va);
}
But be warned that other functions compiled with SSE support won't know that f doesn't use SSE, so calling it with floating-point arguments will cause undefined behavior:
int main() {
    f("%g %g", 1.0, 2.0); // 1.0 and 2.0 are stored in XMM0–1
    // So this will print garbage e.g. `0 6.95326e-310`
}
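That caveat only bites when a caller actually passes floating-point arguments, so one way to stay safe with SSE disabled is a formatter that supports no floating-point conversions at all. Below is a minimal sketch under that assumption. The names `mini_vsnprintf` and `mini_snprintf` are hypothetical, this is not the asker's code, and it handles only `%d`, `%u`, `%x`, `%s` and `%%`.

```c
#include <stdarg.h>
#include <stddef.h>

/* Store one character if it fits, always reserving room for '\0'.
 * `pos` keeps counting past the end so the caller learns the full length. */
static void put_char(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[*pos] = c;
    (*pos)++;
}

/* Emit an unsigned value in base 10 or 16. */
static void put_unsigned(char *buf, size_t size, size_t *pos,
                         unsigned long value, unsigned base)
{
    char digits[sizeof(unsigned long) * 8];  /* enough for any base >= 2 */
    size_t n = 0;

    do {
        digits[n++] = "0123456789abcdef"[value % base];
        value /= base;
    } while (value != 0);

    while (n > 0)
        put_char(buf, size, pos, digits[--n]);
}

/* Integer-only formatter: never reads a double from the va_list. */
int mini_vsnprintf(char *buf, size_t size, const char *fmt, va_list ap)
{
    size_t pos = 0;

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_char(buf, size, &pos, *fmt);
            continue;
        }
        fmt++;
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long mag = (unsigned long)v;
            if (v < 0) {
                put_char(buf, size, &pos, '-');
                mag = 0UL - mag;             /* magnitude, safe for INT_MIN */
            }
            put_unsigned(buf, size, &pos, mag, 10);
            break;
        }
        case 'u':
            put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 10);
            break;
        case 'x':
            put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 16);
            break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            if (s == NULL)
                s = "(null)";
            while (*s != '\0')
                put_char(buf, size, &pos, *s++);
            break;
        }
        case '%':
            put_char(buf, size, &pos, '%');
            break;
        case '\0':
            fmt--;                           /* trailing '%': drop it and stop */
            break;
        default:                             /* unknown conversion: echo it */
            put_char(buf, size, &pos, '%');
            put_char(buf, size, &pos, *fmt);
            break;
        }
    }

    if (size > 0)
        buf[pos < size ? pos : size - 1] = '\0';
    return (int)pos;                         /* length that was needed */
}

/* Convenience wrapper taking the arguments directly. */
int mini_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = mini_vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With a 32-byte buffer, `mini_snprintf(buf, sizeof buf, "%s=%d 0x%x", "v", -42, 255u)` leaves `v=-42 0xff` in `buf`. Like `snprintf`, the sketch returns the length the full output would have needed, so a caller can detect truncation.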
Using -O2 instead of -O3 will make it work.