
Apple: Compile clang frame size with -O0 vs -O2 (kernel)

I have an existing project that we compile as DEBUG for developers (with -O0, so that lldb output makes sense). However, one function in particular balloons in size when -O0 is used:

-O2 -Wframe-larger-than=100
warning: stack frame size of 168 bytes in function 'dsl_scan_visitbp'
-O0 -Wframe-larger-than=100
warning: stack frame size of 1160 bytes in function 'dsl_scan_visitbp'

With some recursion, this can easily overrun the stack (kernel stacks are only 16K).

The first thing to inspect is the local variables, but I believe there are only two:

        dsl_pool_t *dp = scn->scn_dp;
        blkptr_t *bp_toread = NULL;

If you want to see the whole function: https://github.com/openzfs/zfs/blob/master/module/zfs/dsl_scan.c#L1908 (Linux sources, but I am dealing with the Apple clang port).

There are a number of always_inline functions in that source file, which may also come into play here.

But I am curious: why does it grow so large with -O0?

As for what to do about it: I can't see any Apple clang #pragmas that turn optimization "on" for a single function or a single file (only ones that turn it off). If I knew what the cause was, perhaps I could control that specific issue with a different pragma.

The only solution I see right now is to have dsl_scan.c processed differently in the Makefile, so that only that file always gets -O2. But that is a bit tedious.

I'm not familiar with the code base, so I don't see any obvious variables that would be taking large amounts of stack space. However, I notice that the functions (including the always_inline'd ones) are quite long. Typically, in debug builds, every variable and temporary expression result is assigned its own slot in the stack frame, regardless of scope. So even if two variables' lifetimes do not overlap (e.g. one is declared in the if block and another in the else block), they will be allocated separate space in memory. This adds up even when the function only has many small, short-lived variables and temporary values.
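To make the point about non-overlapping lifetimes concrete, here is a minimal userland sketch. The function name `checksum_branch` and its buffers are hypothetical and have nothing to do with the ZFS sources; it only illustrates two locals in disjoint scopes that an unoptimized build will typically keep in separate stack slots.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical example: two 256-byte buffers whose lifetimes never
 * overlap. An unoptimized build usually reserves space for both; an
 * optimized build may let them share a slot or remove them entirely.
 */
uint32_t checksum_branch(int use_first, const uint8_t *src, size_t len)
{
    uint32_t sum = 0;

    assert(src != NULL);

    if (use_first) {
        uint8_t a[256];                 /* live only inside this block */
        size_t n = len < sizeof(a) ? len : sizeof(a);
        memcpy(a, src, n);
        for (size_t i = 0; i < n; i++)
            sum += a[i];
    } else {
        uint8_t b[256];                 /* live only inside this block */
        size_t n = len < sizeof(b) ? len : sizeof(b);
        memcpy(b, src, n);
        for (size_t i = 0; i < n; i++)
            sum += 2u * b[i];
    }
    return sum;
}
```

Compiling this file with `clang -c -O0 -Wframe-larger-than=100` and again with `-O2` should show the same kind of gap as in the question, although the exact numbers depend on the compiler version and target.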

You are probably best off disabling the always_inline attributes in all functions called by this function in debug builds. Otherwise the inlined helpers' locals are pre-allocated in the caller's frame for every possible branch of execution, even branches that are never taken, and even when the helper itself is not involved in the recursion.
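One way this could be done is with a macro that only forces inlining in optimized builds. The sketch below is an assumption on my part, not something taken from the ZFS sources: `SCAN_INLINE`, `visit_leaf` and `visit` are made-up names, and it assumes `DEBUG` is defined for developer builds. I used `noinline` in the debug branch to make the intent explicit; simply dropping the attribute would probably work as well at -O0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical macro: force inlining only when DEBUG is not defined.
 * In debug builds the helper stays a real call, so its locals live in
 * its own frame instead of being merged into every recursive frame.
 */
#if defined(DEBUG)
#define SCAN_INLINE static __attribute__((noinline))
#else
#define SCAN_INLINE static inline __attribute__((always_inline))
#endif

SCAN_INLINE uint64_t visit_leaf(uint64_t depth)
{
    uint64_t scratch[16];       /* stands in for a helper's large locals */

    for (unsigned i = 0; i < 16; i++)
        scratch[i] = depth + i;
    return scratch[depth % 16];
}

/* Recursive caller, standing in for a function like dsl_scan_visitbp. */
uint64_t visit(uint64_t depth)
{
    assert(depth < 1024);       /* keep this demo's recursion bounded */

    if (depth == 0)
        return visit_leaf(depth);
    return visit(depth - 1) + visit_leaf(depth);
}
```

With `-DDEBUG -O0`, the `scratch` array should only be on the stack while `visit_leaf` is actually running, rather than once per level of `visit` recursion. Checking both variants with `-Wframe-larger-than=100` would confirm whether that holds for your toolchain.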
