GCC optimizer generating error in nostdlib code
I have the following code:
void cp(void *a, const void *b, int n) {
    for (int i = 0; i < n; ++i) {
        ((char *) a)[i] = ((const char *) b)[i];
    }
}

void _start(void) {
    char buf[20];
    const char m[] = "123456789012345";
    cp(buf, m, 15);
    register int rax __asm__ ("rax") = 60; // exit
    register int rdi __asm__ ("rdi") = 0;  // status
    __asm__ volatile (
        "syscall" :: "r" (rax), "r" (rdi) : "cc", "rcx", "r11"
    );
    __builtin_unreachable();
}
If I compile it with gcc -nostdlib -O1 "./a.c" -o "./a", I get a functioning program, but if I compile it with -O2, I get a program that generates a segmentation fault.
This is the generated code with -O1:
0000000000001000 <cp>:
1000: b8 00 00 00 00 mov $0x0,%eax
1005: 0f b6 14 06 movzbl (%rsi,%rax,1),%edx
1009: 88 14 07 mov %dl,(%rdi,%rax,1)
100c: 48 83 c0 01 add $0x1,%rax
1010: 48 83 f8 0f cmp $0xf,%rax
1014: 75 ef jne 1005 <cp+0x5>
1016: c3 retq
0000000000001017 <_start>:
1017: 48 83 ec 30 sub $0x30,%rsp
101b: 48 b8 31 32 33 34 35 movabs $0x3837363534333231,%rax
1022: 36 37 38
1025: 48 ba 39 30 31 32 33 movabs $0x35343332313039,%rdx
102c: 34 35 00
102f: 48 89 04 24 mov %rax,(%rsp)
1033: 48 89 54 24 08 mov %rdx,0x8(%rsp)
1038: 48 89 e6 mov %rsp,%rsi
103b: 48 8d 7c 24 10 lea 0x10(%rsp),%rdi
1040: ba 0f 00 00 00 mov $0xf,%edx
1045: e8 b6 ff ff ff callq 1000 <cp>
104a: b8 3c 00 00 00 mov $0x3c,%eax
104f: bf 00 00 00 00 mov $0x0,%edi
1054: 0f 05 syscall
And this is the generated code with -O2:
0000000000001000 <cp>:
1000: 31 c0 xor %eax,%eax
1002: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
1008: 0f b6 14 06 movzbl (%rsi,%rax,1),%edx
100c: 88 14 07 mov %dl,(%rdi,%rax,1)
100f: 48 83 c0 01 add $0x1,%rax
1013: 48 83 f8 0f cmp $0xf,%rax
1017: 75 ef jne 1008 <cp+0x8>
1019: c3 retq
101a: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
0000000000001020 <_start>:
1020: 48 8d 44 24 d8 lea -0x28(%rsp),%rax
1025: 48 8d 54 24 c9 lea -0x37(%rsp),%rdx
102a: b9 31 00 00 00 mov $0x31,%ecx
102f: 66 0f 6f 05 c9 0f 00 movdqa 0xfc9(%rip),%xmm0 # 2000 <_start+0xfe0>
1036: 00
1037: 48 8d 70 0f lea 0xf(%rax),%rsi
103b: 0f 29 44 24 c8 movaps %xmm0,-0x38(%rsp)
1040: eb 0d jmp 104f <_start+0x2f>
1042: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1)
1048: 0f b6 0a movzbl (%rdx),%ecx
104b: 48 83 c2 01 add $0x1,%rdx
104f: 88 08 mov %cl,(%rax)
1051: 48 83 c0 01 add $0x1,%rax
1055: 48 39 f0 cmp %rsi,%rax
1058: 75 ee jne 1048 <_start+0x28>
105a: b8 3c 00 00 00 mov $0x3c,%eax
105f: 31 ff xor %edi,%edi
1061: 0f 05 syscall
The crash happens at 103b, instruction movaps %xmm0,-0x38(%rsp).
I noticed that if m contains fewer than 15 characters, then the generated code is different and the crash does not happen.
What am I doing wrong?
_start is not a function. It's not called by anything, and on entry the stack is 16-byte aligned, not (as the ABI requires) 8 bytes away from 16-byte alignment.
(The ABI requires 16-byte alignment before a call, and call pushes an 8-byte return address. So on function entry RSP-8 and RSP+8 are 16-byte aligned.)
At -O2, GCC uses alignment-required 16-byte instructions to implement the copy done by cp(), copying the "123456789012345" from static storage to the stack.
At -O1, GCC just uses two mov r64, imm64 instructions to get bytes into integer regs for 8-byte stores. These don't require alignment.
Just write a main in C like a normal person if you want everything to work.
Or if you're trying to microbenchmark something light-weight in asm, you can use gcc -nostdlib -O3 -mincoming-stack-boundary=3 (docs) to tell GCC that functions can't assume they're called with more than 8-byte alignment. Unlike -mpreferred-stack-boundary=3, this will still align by 16 before making further calls. So if you have other non-leaf functions, you might want to just use an attribute on your hacky C _start() instead of affecting the whole file.
A worse, more hacky way would be to try putting
asm("push %rax");
at the very top of _start to modify RSP by 8, where GCC hopefully runs it before doing anything else with the stack. GNU C Basic asm statements are implicitly volatile, so you don't need asm volatile, although that wouldn't hurt.
You're 100% on your own and responsible for correctly tricking the compiler by using inline asm that works for whatever optimization level you're using.
Another, safer way would be to write your own light-weight _start that calls main:
// at global scope:
asm(
    ".globl _start \n"
    "_start: \n"
    "    mov  (%rsp), %rdi \n"            // argc
    "    lea  8(%rsp), %rsi \n"           // argv
    "    lea  8(%rsi, %rdi, 8), %rdx \n"  // envp
    "    call main \n"
    // NOT DONE: stdio cleanup or other atexit stuff
    // DO NOT USE WITH GLIBC; use libc's CRT code if you use libc
    "    mov  %eax, %edi \n"
    "    mov  $231, %eax \n"
    "    syscall"                         // exit_group( main() )
);

int main(int argc, char **argv, char **envp) {
    // ... your code here
    return 0;
}
If you didn't want main to return, you could just pop %rdi; mov %rsp, %rsi; jmp main to give it argc and argv without a return address.
Then main can exit via inline asm, or by calling exit() or _exit() if you link libc. (But if you link libc, you should usually use its _start.)
See also: How Get arguments value using inline assembly in C without Glibc? for other hand-rolled _start versions; this is pretty much like @zwol's there.