
Cygwin64 gcc C/assembler crash when using -O2 vs -O1

I have the following code (two files):

main.c

#include <stdio.h>

void print();
void asm_print();

int main() {
    asm_print();
    printf("done\n");

    return 0;
}

void print() {
    printf("printing with number: %d\n", 1);
    // printf("printing without number\n");
}

lib.s

    .intel_syntax noprefix
    .text

    .globl asm_print
asm_print:
    push    rbp
    mov     rbp, rsp
    call    print
    mov     rsp, rbp
    pop     rbp
    ret

expected output

printing with number: 1
done

If I compile on Linux using gcc 4.9.3 and the command line:

gcc -O2 -m64 -Wall -Wextra -Werror -std=gnu99 main.c lib.s

everything works fine. This also works if I use -O1 or -O3. If I compile on Cygwin64 using gcc 4.9.3 and the command line:

gcc -O1 -m64 -Wall -Wextra -Werror -std=gnu99 main.c lib.s

everything works fine.

If, however, I change the above to use -O2 or -O3, only the first line of output is produced. If in the function print() I comment out the first line and uncomment the second line, I get the expected output:

printing without number
done

regardless of the optimization level I use. Can anyone suggest what is wrong with the code, and how to get the expected output on Cygwin64 regardless of the optimization level?

The problem is that the Windows 64-bit ABI, which Cygwin64 follows, differs from both the 32-bit ABI and the System V ABI used on 64-bit Linux: it requires the caller to allocate 32 bytes of scratch parameter (home) space on the stack for use by the callee.

http://eli.thegreenplace.net/2011/09/06/stack-frame-layout-on-x86-64/
https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx

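So what you need to do is decrement the stack pointer by at least 32 before the call. In addition, x64 requires rsp to be a multiple of 16 bytes at every call instruction. The 64-bit return address is 8 bytes, so on entry to a function rsp is 8 bytes off alignment, and a function that pushes nothing must move rsp by 40, 56, etc. Here the push rbp already accounts for those 8 bytes, so the adjustment must itself be a multiple of 16 (32, 48, etc.):

asm_print:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 32
    call    print
    add     rsp, 32
    pop     rbp
    ret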
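
The alignment arithmetic is easy to get wrong, so it can help to check it mechanically. The following is a minimal sketch in C, assuming the Windows x64 rule that rsp must be 16-byte aligned immediately before every call instruction. The function names rsp_mod16_at_call and frame_ok are hypothetical and only model the arithmetic; they do not inspect a real stack.

```c
/* Hypothetical helpers, not part of the original post: they model rsp
   modulo 16 from function entry up to the next call instruction. */

/* Returns rsp % 16 at the call, given the number of 8-byte pushes in the
   prologue and the byte count subtracted from rsp afterwards. */
unsigned rsp_mod16_at_call(unsigned pushes, unsigned sub_bytes)
{
    /* On entry, the caller's call has just pushed an 8-byte return address
       onto a 16-byte-aligned stack, so rsp % 16 == 8. */
    unsigned entry = 8u;
    /* The stack grows downward; work modulo 16 and keep the value positive. */
    unsigned used = (pushes * 8u + sub_bytes) % 16u;
    return (entry + 16u - used) % 16u;
}

/* A frame is acceptable when at least 32 bytes of home space were reserved
   and rsp is 16-byte aligned at the call. */
int frame_ok(unsigned pushes, unsigned sub_bytes)
{
    return sub_bytes >= 32u && rsp_mod16_at_call(pushes, sub_bytes) == 0u;
}
```

Under this model, frame_ok(1, 32) and frame_ok(0, 40) both hold, while frame_ok(1, 40) and frame_ok(0, 32) do not: one push plus a multiple of 16, or no push plus 8 more than a multiple of 16, keeps the stack aligned. The original asm_print corresponds to frame_ok(1, 0), which is aligned but reserves no home space.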

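Presumably, when you call print / printf with just a string constant, the callee doesn't actually use any of the scratch space, so nothing bad happens. On the other hand, when you use the %d format, printf needs the parameter space and writes into it. Because asm_print never reserved that space, the writes land on the values asm_print left on the stack: the saved rbp and the return address into main. That is why the first line prints but "done" never appears.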
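
To make the overlap concrete, here is a small simulation in C. It is only a sketch under stated assumptions: the stack is modelled as an array of 8-byte slots, the marker values are made up, and the callee is assumed to spill all four register arguments (rcx, rdx, r8, r9) into its home space, as variadic functions commonly do. The name frame_survives is hypothetical, and nothing here describes the actual printf implementation on Cygwin.

```c
#include <stdint.h>

/* Hypothetical marker values standing in for real addresses. */
#define RET_TO_MAIN 0x2222u
#define SAVED_RBP   0x1111u
#define RET_TO_ASM  0x3333u

enum { SLOTS = 32, MAIN_HOME_SLOTS = 4 };

/* Models asm_print reserving reserve_bytes before calling a callee that
   spills its four register arguments to [rsp+8] .. [rsp+32].
   Returns 1 if the saved rbp and the return address into main survive,
   0 if either is overwritten, -1 if the input is outside this model. */
int frame_survives(unsigned reserve_bytes, const uint64_t args[4])
{
    uint64_t stack[SLOTS] = {0};
    unsigned sp, ret_main, saved_rbp, i;

    if (reserve_bytes % 8u != 0u || reserve_bytes > 64u)
        return -1;

    sp = SLOTS - MAIN_HOME_SLOTS;               /* home space reserved by main */
    stack[--sp] = RET_TO_MAIN; ret_main = sp;   /* call asm_print */
    stack[--sp] = SAVED_RBP;   saved_rbp = sp;  /* push rbp */
    sp -= reserve_bytes / 8u;                   /* sub rsp, reserve_bytes */
    stack[--sp] = RET_TO_ASM;                   /* call print */

    /* The callee's home space is the four slots just above its return address. */
    for (i = 0; i < 4; i++)
        stack[sp + 1 + i] = args[i];

    return stack[ret_main] == RET_TO_MAIN && stack[saved_rbp] == SAVED_RBP;
}
```

With reserve_bytes set to 0, 16, or 24 the spilled arguments overwrite at least one of the two saved values and the function returns 0; from 32 upward both survive. The model ignores alignment and assumes the argument values differ from the marker values.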

The reason it works at -O1 is that print does not itself use its home space, and it allocates fresh parameter space when it calls printf. At -O2, the compiler performs tail-call elimination and replaces the "call printf" instruction with a "jmp printf". As a result, printf reuses the parameter space that asm_print was supposed to allocate.
