Compiler changes printf to puts

Question

Consider the following code:

#include <stdio.h>

void foo() {
    printf("Hello world\n");
}

void bar() {
    printf("Hello world");
}

The assembly produced for these two functions is:

.LC0:
        .string "Hello world"
foo():
        mov     edi, OFFSET FLAT:.LC0
        jmp     puts
bar():
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        jmp     printf

I know the difference between puts and printf, but I find it quite interesting that gcc is able to inspect the const char* and work out whether to call printf or puts.

Another interesting thing is that in bar, the compiler zeroed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

Am I correct in assuming that the compiler 'introspected my string', or is there another explanation for this?

Answer 1

Am I correct in assuming that the compiler 'introspected my string', or is there another explanation for this?

Yes, this is exactly what happens. It's a simple and common optimization done by the compiler.

Since your first printf() call is just:

printf("Hello world\n");

it is equivalent to:

puts("Hello world");

Since puts() does not need to scan and parse the string for format specifiers, it is generally faster than printf(). The compiler notices that your string ends with a newline and contains no format specifiers, and therefore converts the call automatically.

This also saves a bit of space, since only one string, "Hello world", now needs to be stored in the resulting binary.

Note that this is not possible in general for calls of the form:

printf(some_var);

If some_var is not a simple constant string, the compiler cannot know whether it ends in \n.

Other common optimizations are:

strlen("constant string") might get evaluated at compile time and replaced with a number.
memmove(location1, location2, sz) might get transformed into memcpy() if the compiler is sure that location1 and location2 don't overlap.
memcpy() of a small size can be converted into a single mov instruction, and even for larger sizes the call can sometimes be inlined to make it faster.

Another interesting thing is that in bar, the compiler zeroed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

See here: Why is %eax zeroed before a call to printf?

Answer 2

Another interesting thing is that in bar, the compiler zeroed out the return register (eax) even though it is a void function. Why did it do that there and not in foo?

This is completely unrelated to the question in the title, but is interesting nonetheless.

The xor that zeroes %eax comes before the call to printf, so it is part of the call sequence and has nothing to do with the return value. It is there because printf is a varargs function: the x86_64 ABI for varargs functions passes floating-point arguments in xmm registers and requires the number of xmm registers used to be passed in %al. So this instruction ensures that %al is 0, as no arguments are being passed to printf in xmm registers.

puts is not a varargs function, so the instruction is not needed there.

2 answers

Answer 1
10, accepted, 2020-02-05 16:29:42

Answer 2
5, 2021-10-12 17:09:33
