What's up with gcc's handling of alloca?

On most platforms, alloca just boils down to an inline adjustment of the stack pointer (for example, subtracting from rsp on x64, plus a bit of logic to maintain stack alignment).

I was looking at the code that gcc generates for alloca, and it is pretty weird. Take the following simple example [1]:

#include <alloca.h>
#include <stddef.h>

volatile void *psink;

void func(size_t x) {
  psink = alloca(x);
}

This compiles to the following assembly at -O2:

func(unsigned long):
        push    rbp
        add     rdi, 30
        and     rdi, -16
        mov     rbp, rsp
        sub     rsp, rdi
        lea     rax, [rsp+15]
        and     rax, -16
        mov     QWORD PTR psink[rip], rax
        leave
        ret

There are several confusing things here. I understand that gcc needs to round the allocated size up to a multiple of 16 (to maintain stack alignment), and the usual way to do that would be (size + 15) & ~0xF, but instead it adds 30 (the add rdi, 30 instruction). What's up with that?
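To make the difference concrete, here is a minimal sketch that puts the conventional round-up idiom next to the adjustment seen in the assembly above. The helper names (round_up_16, gcc_adjustment, extra_bytes) are made up for illustration and do not come from gcc.

```c
#include <stddef.h>

/* Hypothetical helper: the conventional "round up to a multiple of 16" idiom. */
size_t round_up_16(size_t size) {
    return (size + 15) & ~(size_t)0xF;
}

/* Hypothetical helper: the adjustment from the listing above
   (add rdi, 30 ; and rdi, -16). */
size_t gcc_adjustment(size_t size) {
    return (size + 30) & ~(size_t)0xF;
}

/* How many bytes the second formula reserves beyond the first. */
size_t extra_bytes(size_t size) {
    return gcc_adjustment(size) - round_up_16(size);
}
```

Working through a few values by hand: for a size of 2 or 16 the idiom gives 16 while the gcc formula gives 32, so extra_bytes is 16. For a size of 1 or 17 (one more than a multiple of 16) both formulas agree and extra_bytes is 0.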

Second, I would expect the result of alloca to simply be the new rsp value, which is already well-aligned. Instead, gcc does this:

    lea     rax, [rsp+15]
    and     rax, -16

This seems to be "realigning" the value of rsp to use as the result of alloca, but we already did the work to align rsp to a 16-byte boundary in the first place.

What's up with that?

You can play with the code on godbolt. It is worth noting that clang and icc do the "expected thing", on x86 at least. With VLAs (as suggested in earlier comments), gcc and clang do fine, while icc produces an abomination.
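The VLA variant is not shown in the post. A minimal sketch of what it might look like follows, assuming a compiler with C99 variable-length array support. The name func_vla and the returned sizeof are illustrative additions, not part of the original example.

```c
#include <stddef.h>

volatile void *psink;

/* Hypothetical VLA counterpart of func(): the buffer is a
   variable-length array instead of an alloca() result.
   Note: a VLA with zero length is undefined behavior, so x must be > 0. */
size_t func_vla(size_t x) {
    char buf[x];       /* storage is released when the function returns */
    psink = buf;       /* consume the address so the array is not optimized away */
    return sizeof buf; /* for a VLA, sizeof is evaluated at run time and equals x */
}
```

As with alloca, the pointer stored in psink must not be dereferenced after func_vla returns.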


[1] Here, the assignment to psink is just to consume the result of alloca, since otherwise the compiler would omit it entirely.

This is a very old, normal-priority bug. The code works correctly. It's just that for most sizes (going by the assembly above, every size except those of the form 16k + 1), 16 more bytes than necessary are allocated. So it's not a correctness bug; it's a minor efficiency bug.
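The "works correctly" part can be checked with a small model of the instruction sequence, applied to a made-up stack pointer value. This is only a sketch: alloca_model, model_gcc_alloca and block_fits are hypothetical names, and the model covers just the arithmetic, not real stack behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the values produced by the gcc sequence. */
typedef struct {
    uintptr_t new_sp;  /* rsp after "sub rsp, rdi" */
    uintptr_t result;  /* pointer handed back as the alloca result */
} alloca_model;

alloca_model model_gcc_alloca(uintptr_t sp, size_t x) {
    alloca_model m;
    size_t adj = (x + 30) & ~(size_t)15;         /* add rdi, 30 ; and rdi, -16 */
    m.new_sp = sp - adj;                         /* sub rsp, rdi */
    m.result = (m.new_sp + 15) & ~(uintptr_t)15; /* lea rax, [rsp+15] ; and rax, -16 */
    return m;
}

/* Returns 1 if the x-byte block is 16-byte aligned and lies entirely
   between the new and the old stack pointer. */
int block_fits(uintptr_t sp, size_t x) {
    alloca_model m = model_gcc_alloca(sp, x);
    return m.result % 16 == 0
        && m.result >= m.new_sp
        && m.result + x <= sp;
}
```

For a 16-byte-aligned sp, the realignment step changes nothing and result equals new_sp. The adjustment is always at least x + 15, so the block also fits when sp is not aligned. That is one way to read the constant 30 (15 for rounding the size plus 15 of slack for aligning the pointer), though this is an interpretation of the arithmetic rather than a statement of gcc's rationale.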
