
How do I stop GCC stripping trailing newline from string literal in obj file?

Working under Linux, I just ran into the following issue. (I'm sure someone will have the answer, but so far I haven't found a simple, clear one :)

/*compile with gcc -o out.x hello.c*/

#include <stdio.h>

int main(void)
{
    printf("Hello World2\r\n");
    printf("Hello World3\r\n ");

    return 0;
}

Building the code above under Linux gives two strings, BUT their ending characters differ: the first string ends with 0x0d, while the second ends with 0x0d, 0x0a (followed by the trailing space).

This is something done by the compiler (GCC), as you can see in the object file:

Contents of section .rodata:
 400610 01000200 48656c6c 6f20576f 726c6432  ....Hello World2
 400620 0d004865 6c6c6f20 576f726c 64330d0a  ..Hello World3..
 400630 2000                                  .              
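To read the dump against the source, here is a minimal sketch that turns a string plus its terminating NUL into hex. The helper name hex_with_nul is made up for illustration; the two strings are the ones that appear in .rodata above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of s, including the terminating
 * NUL, as lowercase hex into out. Returns out, or NULL if cap is too small. */
static char *hex_with_nul(const char *s, char *out, size_t cap)
{
    size_t n = strlen(s) + 1;            /* +1 for the terminating NUL */
    if (cap < 2 * n + 1)
        return NULL;
    for (size_t i = 0; i < n; i++)
        snprintf(out + 2 * i, 3, "%02x", (unsigned char)s[i]);
    return out;
}
```

Calling it on "Hello World2\r" gives 48656c6c6f20576f726c64320d00, and on "Hello World3\r\n " gives 48656c6c6f20576f726c64330d0a2000, which are the byte runs shown in the section dump: the first string is stored without its 0x0a, the second is stored unchanged.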

So my questions are:

  • Why?
  • How can I avoid this kind of "optimization"(!?)

Thanks

Creating formatted output at runtime takes time; the printf call is slow. GCC knows this, so it replaces the first printf call with a call to puts. Since puts automatically appends a \n, GCC has to remove the \n from the string to compensate.

GCC does this because it treats printf as a built-in. Since this has no effect on the bytes output, or even on the number of calls to write, I strongly recommend leaving it as-is. If you do want to disable it, you can pass -fno-builtin-printf, but the only effect will be to slow down your code as it tries to format the string unnecessarily.
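To check the claim that the bytes written do not change, here is a minimal sketch (it is not what GCC does internally). The names emit_printf_style, emit_puts_style, capture and outputs_match are made up for illustration, and fputs followed by fputc stands in for puts so that the output can go to a temporary file instead of stdout.

```c
#include <stdio.h>
#include <string.h>

/* Emits the text the way the source spells it: newline inside the literal. */
static void emit_printf_style(FILE *out)
{
    fprintf(out, "Hello World2\r\n");
}

/* Emits the text the way the transformed call does: the stored string has
 * no '\n', and the newline is appended at run time (as puts would do). */
static void emit_puts_style(FILE *out)
{
    fputs("Hello World2\r", out);
    fputc('\n', out);
}

/* Runs an emitter against a temporary file and copies what it wrote into
 * buf. Returns the number of bytes captured, or 0 on failure. */
static size_t capture(void (*emit)(FILE *), char *buf, size_t cap)
{
    FILE *tmp = tmpfile();
    if (tmp == NULL)
        return 0;
    emit(tmp);
    rewind(tmp);
    size_t n = fread(buf, 1, cap, tmp);
    fclose(tmp);
    return n;
}

/* Returns 1 when both styles produce byte-identical output. */
static int outputs_match(void)
{
    char a[64], b[64];
    size_t na = capture(emit_printf_style, a, sizeof a);
    size_t nb = capture(emit_puts_style, b, sizeof b);
    return na > 0 && na == nb && memcmp(a, b, na) == 0;
}
```

Both emitters write the same 14 bytes, ending in 0x0d 0x0a, so a program cannot observe the difference through its output; only the bytes stored in .rodata differ.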

It is simpler to ask GCC (I am using GCC 7.2 on Linux/Debian/Sid/x86-64) to emit assembly. So I compiled your program as bflash.c with

gcc -fverbose-asm -O0 -S bflash.c -o bflash-O0.S

to get it without optimization, and with

gcc -fverbose-asm -O1 -S bflash.c -o bflash-O1.S

to get -O1 optimization. Feel free to repeat the experiment with various other optimization flags.

Even without optimization, the bflash-O0.S contains:

    .section    .rodata
.LC0:
    .string "Hello World2\r"
.LC1:
    .string "Hello World3\r\n "
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp    #
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp  #,
    .cfi_def_cfa_register 6
# bflash.c:5:     printf("Hello World2\r\n");
    leaq    .LC0(%rip), %rdi    #,
    call    puts@PLT    #
# bflash.c:6:     printf("Hello World3\r\n ");
    leaq    .LC1(%rip), %rdi    #,
    movl    $0, %eax    #,
    call    printf@PLT  #
# bflash.c:8:  return 0;
    movl    $0, %eax    #, _4
# bflash.c:9: }
    popq    %rbp    #
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main

As you can see, the first printf has been optimized into a puts; this is permitted by the C11 standard n1570 (the as-if rule). BTW, bflash-O1.S contains similar code. Note that the C11 standard was specified with current optimization practices in mind (many members of the standardization committee are compiler implementors).
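Why only the first call changes can be modelled with a small predicate. This is a simplified sketch of the condition as I understand it, not GCC's actual code: the helper name could_become_puts is hypothetical, and I am assuming the rewrite applies to a one-argument printf whose result is ignored, whose format contains no conversion, and whose format ends in a newline.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (simplified model, not GCC's implementation):
 * true when printf(fmt), called with no further arguments and with its
 * return value unused, prints the same bytes as puts() on fmt minus its
 * final '\n'. */
static bool could_become_puts(const char *fmt)
{
    size_t len = strlen(fmt);

    if (len == 0)
        return false;               /* nothing to print, no newline to drop */
    if (strchr(fmt, '%') != NULL)
        return false;               /* a '%' needs real formatting */
    return fmt[len - 1] == '\n';    /* puts supplies exactly this newline */
}
```

Under this model "Hello World2\r\n" qualifies, while "Hello World3\r\n " does not because its last character is a space, which matches the assembly above: .LC0 lost its \n and is passed to puts, .LC1 is untouched and still goes to printf.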

BTW Clang 5, invoked as clang-5.0 -O1 -fverbose-asm -S bflash.c -o bflash-01clang.s, performs the same kind of optimization.

How can I avoid this kind of "optimization"(!?)

Follow Daniel H's answer (and you might compile with -ffreestanding, but I don't recommend that).

Or avoid using printf from <stdio.h> and implement your own slower printing function. If you implement your own printing function, name it differently (since printf is defined in the C11 standard), and perhaps (if you want) write your own GCC plugin to optimize it your way (that plugin had better be free software that is GPL-compatible; read the GCC runtime library exception).
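A minimal sketch of such a differently named function, assuming a thin wrapper over vfprintf is acceptable. The name my_print is hypothetical. Since GCC has no built-in knowledge of a function with that name, I would expect the format literal to be passed through unchanged, though nothing in the standard guarantees what ends up in the object file.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical name, deliberately not "printf": the compiler knows nothing
 * special about my_print, so it has no puts() rewrite to apply to callers.
 * Returns what vfprintf returns: the number of characters written, or a
 * negative value on error. */
static int my_print(FILE *out, const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    int written = vfprintf(out, fmt, ap);
    va_end(ap);
    return written;
}
```

A call such as my_print(stdout, "Hello World2\r\n") then always goes through the formatting code, which is exactly the slowdown mentioned above.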

The C language specification (study n1570) defines a semantics, that is, the behavior of your compiled program. It does not require any particular sequence of bytes to appear in the executable (which is probably not even mentioned in the standard). If you need such a property, find a different programming language, and give up all the important optimizations GCC is trying hard to do for you. Optimizations are what make writing a C compiler difficult (if you want a non-optimizing compiler, use something other than GCC, but accept losing perhaps a factor of three or more in performance relative to code compiled with gcc -O2).
