
How does the compiler determine the needed stack size for a function with compiler-generated temporaries?

Consider the following code:

class cFoo {
    private:
        int m1;
        char m2;
    public:
        int doSomething1();
        int doSomething2();
        int doSomething3();
};

class cBar {
    private:
        cFoo mFoo;
    public:
        cFoo getFoo(){ return mFoo; }
};

void some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();
    int test2 = aBar.getFoo().doSomething2();
    ...
}

In the line where getFoo() is called, the compiler will generate a temporary object of type cFoo in order to call doSomething1(). Does the compiler reuse the stack memory used for these temporary objects? How much stack memory will a call to "some_function_in_the_callstack_hierarchy" reserve? Does it reserve memory for every generated temporary?

My guess was that the compiler reserves memory for only one cFoo object and reuses that memory for the different calls, but if I add

    int test3 = aBar.getFoo().doSomething3();

I can see that the stack size needed for "some_function_in_the_callstack_hierarchy" is much larger, and it's not only because of the additional local int variable.

On the other hand, if I then replace

cFoo getFoo(){ return mFoo; }

with a version that returns a reference (for testing purposes only, because returning a reference to a private member is not good)

const cFoo& getFoo(){ return mFoo; }

it needs way less stack memory than the size of one cFoo.

So it seems to me that the compiler reserves extra stack memory for every temporary object generated in the function. But this would be very inefficient. Can someone explain this?

An optimizing compiler transforms your source code into some internal representation and normalizes it.

With free software compilers (like GCC and Clang/LLVM), you can look into that internal representation (at the very least by patching the compiler code or running it in a debugger).

BTW, sometimes temporary values do not need any stack space at all, e.g. because they have been optimized away or because they can sit in registers. Quite often they reuse some unneeded slot in the current call frame. Also (particularly in C++), a lot of small functions are inlined, as your getFoo probably is, so they don't have a call frame of their own. Recent versions of GCC are sometimes even capable of tail-call optimization (essentially, reusing the caller's call frame).

If you compile with GCC (i.e. g++), I would suggest playing with the optimization options and the developer options (and some others). Perhaps also use -Wstack-usage=48 (or some other value, in bytes per call frame) and/or -fstack-usage.

First, if you can read assembler code, compile yourcode.cc with g++ -S -fverbose-asm -O yourcode.cc and look into the emitted yourcode.s

(don't forget to play with the optimization flags, e.g. replace -O with -O2 or -O3 ...)
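To try these flags on something that actually compiles, here is a minimal sketch of the question's code as a single file. The file name, the member initial values, and the bodies of doSomething1() to doSomething3() are assumptions made for illustration (the question never shows them), and the function returns a sum instead of void so the calls have an observable result. The elided `...` from the question is simply left out.

```cpp
// stack_demo.cc: hypothetical file name for this sketch
#include <cassert>

class cFoo {
    private:
        int m1 = 1;     // initial values are assumptions, not from the question
        char m2 = 'x';
    public:
        int doSomething1();
        int doSomething2();
        int doSomething3();
};

// Placeholder bodies (assumed): the question only declares these functions.
int cFoo::doSomething1() { return m1 + 1; }
int cFoo::doSomething2() { return m1 + 2; }
int cFoo::doSomething3() { return m1 + (m2 == 'x' ? 3 : 0); }

class cBar {
    private:
        cFoo mFoo;
    public:
        cFoo getFoo() { return mFoo; }  // returns a copy: each call yields a temporary
};

// Returns the sum (the original returns void) so the result can be checked.
int some_function_in_the_callstack_hierarchy(cBar aBar) {
    int test1 = aBar.getFoo().doSomething1();
    int test2 = aBar.getFoo().doSomething2();
    int test3 = aBar.getFoo().doSomething3();
    assert(test1 == 2 && test2 == 3 && test3 == 4);  // follows from the placeholder bodies
    return test1 + test2 + test3;
}
```

Compiling with something like g++ -c -O2 -fstack-usage stack_demo.cc should write a stack_demo.su file listing the frame size per function. Because everything lives in one translation unit, an optimizing build may inline all of it; moving the member function bodies into a separate file keeps the calls visible.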

Then, if you are more curious about how the compiler is optimizing, try g++ -O -fdump-tree-all -c yourcode.cc and you'll get a lot of so-called "dump files", which contain a partial textual rendering of the internal representations relevant to GCC.

If you are even more curious, look into my GCC MELT and notably its documentation page (which contains a lot of slides and references).

"So for me it seems that the compiler reserves extra stack memory for every generated temporary object in the function."

Certainly not in the general case (assuming, of course, that you enable some optimizations). And even if some space is reserved, it would be reused very quickly.

BTW, notice that the C++11 standard does not speak of a stack. One could imagine a C++ program compiled without using any stack (e.g. a whole-program optimization detecting a program without recursion, whose stack space and layout could be optimized to avoid any stack. I don't know of any such compiler, but I do know that compilers can be quite clever...)

Attempting to analyse how a compiler is going to treat a particular piece of code is getting progressively more difficult as optimisation strategies get more aggressive.

All a compiler has to do is implement the C++ standard and compile the code without introducing or cancelling any side effects (with some exceptions, such as return value optimisation and named return value optimisation).
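The return value optimisation exception can be made visible with a small counter. This is a minimal sketch; the names Counted, makeCounted, and copiesAfterReturnByValue are hypothetical and not from the question. The copy constructor has a side effect (incrementing a counter), yet the compiler is allowed to skip the copy entirely.

```cpp
#include <cassert>

struct Counted {
    static int constructed;  // number of default constructions
    static int copies;       // number of copy constructions actually performed
    Counted() { ++constructed; }
    Counted(const Counted&) { ++copies; }
};
int Counted::constructed = 0;
int Counted::copies = 0;

// Returns a prvalue; the copy into the caller's object may be elided.
Counted makeCounted() { return Counted(); }

int copiesAfterReturnByValue() {
    Counted::constructed = 0;
    Counted::copies = 0;
    Counted c = makeCounted();
    (void)c;  // silence unused-variable warnings
    // Exactly one object is default-constructed regardless of elision.
    assert(Counted::constructed == 1);
    return Counted::copies;
}
```

Since C++17 the elision in this prvalue case is mandatory, so the function returns 0. Under earlier standards it is permitted rather than required, although mainstream compilers perform it by default.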

You can see from your code that, since cFoo is not a polymorphic type and has no member data, a compiler could optimise out the creation of the object altogether and directly call what are therefore essentially static functions. I'd imagine that even at the time of writing, some compilers are already doing that. You can always check the output assembly to be sure.

Edit: The OP has now introduced class members. But since these are never initialised and are private, the compiler can remove them without thinking too hard about it. This answer therefore still applies.

The lifetime of a temporary object lasts until the end of the full-expression that contains it; see section 12.2, "Temporary objects", of the Standard.
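The lifetime rule can be observed directly with a type that logs its construction and destruction. This is a minimal sketch; Tracer, makeTracer, events, and run are hypothetical names introduced for illustration, and the exact log assumes the usual copy elision on return (guaranteed since C++17).

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;  // global log, for illustration only

struct Tracer {
    Tracer() { events.push_back("ctor"); }
    ~Tracer() { events.push_back("dtor"); }
    int value() const { return 42; }
};

Tracer makeTracer() { return Tracer(); }

std::vector<std::string> run() {
    events.clear();
    int a = makeTracer().value();  // the temporary is destroyed at the end of this statement
    events.push_back("between");
    int b = makeTracer().value();  // a new temporary, created after the first one is gone
    events.push_back("end");
    (void)a;
    (void)b;
    // The first temporary is already destroyed before "between" is logged.
    assert(events[1] == "dtor");
    return events;
}
```

Because the first temporary is dead before the second one is created, nothing in the language stops a compiler from placing both in the same stack slot.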

It is very unlikely that, even with the lowest optimisation settings, a compiler will not reuse the space after the end of a temporary object's lifetime.
