Why does LLVM allocate a redundant variable?

Here's a simple C file with an enum definition and a main function:

enum days {MON, TUE, WED, THU};

int main() {
    enum days d;
    d = WED;
    return 0;
}

Clang compiles it to the following (unoptimized) LLVM IR:

define dso_local i32 @main() #0 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  store i32 2, i32* %2, align 4
  ret i32 0
}

%2 is evidently the d variable, which gets 2 assigned to it. What does %1 correspond to if zero is returned directly?

This %1 is a stack slot that Clang generates to hold the function's return value, which is how it handles multiple return statements in a function. Imagine you were writing a function to compute an integer's factorial. Instead of this:

int factorial(int n){
    int result;
    if(n < 2)
      result = 1;
    else{
      result = n * factorial(n-1);
    }
    return result;
}

You'd probably do this:

int factorial(int n){
    if(n < 2)
      return 1;
    return n * factorial(n-1);
}

Why? Because Clang will insert that result variable, which holds the return value, for you. Yay. That's the reason for the %1 variable. Look at the IR for a slightly modified version of your code.
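To make the hidden variable concrete, here is a hand-written sketch of roughly what the early-return version looks like once the return slot is spelled out. It is an illustration, not Clang's actual output: the names factorial_single_exit, retval and done are made up for this example, and the assert is only there to keep the arithmetic within a 32-bit int.

```c
#include <assert.h>

/* Hand-written equivalent of the early-return factorial with the
 * hidden return slot made explicit. `retval` plays the role of the
 * extra alloca in the IR; the names and goto layout are illustrative. */
int factorial_single_exit(int n) {
    int retval;                 /* the slot that holds the result */

    assert(n <= 12);            /* 13! no longer fits in a 32-bit int */

    if (n < 2) {
        retval = 1;             /* "return 1" becomes store + jump */
        goto done;
    }
    retval = n * factorial_single_exit(n - 1);
    goto done;                  /* "return n * ..." becomes store + jump */

done:
    return retval;              /* the only real return: load the slot */
}
```

Every return in the source turns into a store to the slot followed by a jump to one shared exit, which is the same shape as the branches in the IR for the modified main.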

Modified code:

enum days {MON, TUE, WED, THU};

int main() {
    enum days d;
    d = WED;
    if(d) return 1;
    return 0;
}

IR:

define dso_local i32 @main() #0 !dbg !15 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  store i32 2, i32* %2, align 4, !dbg !22
  %3 = load i32, i32* %2, align 4, !dbg !23
  %4 = icmp ne i32 %3, 0, !dbg !23
  br i1 %4, label %5, label %6, !dbg !25

5:                                                ; preds = %0
  store i32 1, i32* %1, align 4, !dbg !26
  br label %7, !dbg !26

6:                                                ; preds = %0
  store i32 0, i32* %1, align 4, !dbg !27
  br label %7, !dbg !27

7:                                                ; preds = %6, %5
  %8 = load i32, i32* %1, align 4, !dbg !28
  ret i32 %8, !dbg !28
}

Now you see %1 making itself useful, huh? Each return stores its value into %1, and the single exit block loads that value and returns it. Most functions with a single return statement will have this variable stripped by one of LLVM's passes.

Why does this matter? What's the actual problem?

I think the deeper answer you're looking for might be: LLVM's architecture is based around fairly simple frontends and many passes. The frontends have to generate correct code, but it doesn't have to be good code. They can do the simplest thing that works.

In this case, Clang generates a couple of instructions that turn out not to be used for anything. That's generally not a problem, because some part of LLVM will get rid of superfluous instructions. Clang trusts that to happen. Clang doesn't need to avoid emitting dead code; its implementation can focus on correctness, simplicity, testability, and so on.
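As a rough illustration of that division of labour, the sketch below runs a toy cleanup over a made-up miniature IR. It is not how LLVM implements any of its passes: the types, the opcode set and the function names are all invented for this example, and it ignores things a real compiler must handle (volatile accesses, pointers that escape, and so on). It only shows the idea that a later stage can delete a stack slot that is written but never read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SLOTS 8

enum opcode { OP_ALLOCA, OP_STORE, OP_LOAD, OP_RET };

/* One instruction of a toy IR (not LLVM's real data structures). */
struct instr {
    enum opcode op;
    int slot;   /* stack slot used by ALLOCA/STORE/LOAD, -1 otherwise */
    int value;  /* constant stored or returned, 0 otherwise */
};

/* Drop every slot that is never loaded, together with the stores into it.
 * Compacts `code` in place and returns the new instruction count. */
size_t remove_unread_slots(struct instr *code, size_t n) {
    bool loaded[MAX_SLOTS] = { false };
    size_t out = 0;

    /* Pass 1: record which slots are ever read. */
    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_LOAD) {
            assert(code[i].slot >= 0 && code[i].slot < MAX_SLOTS);
            loaded[code[i].slot] = true;
        }
    }

    /* Pass 2: keep an instruction unless it only touches an unread slot. */
    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_ALLOCA || code[i].op == OP_STORE) {
            assert(code[i].slot >= 0 && code[i].slot < MAX_SLOTS);
            if (!loaded[code[i].slot])
                continue;
        }
        code[out++] = code[i];
    }
    return out;
}

/* Encode the unoptimized main from the question in the toy IR, clean it
 * up, and return how many instructions survive. `code` needs room for 5. */
size_t optimize_question_main(struct instr *code) {
    const struct instr program[] = {
        { OP_ALLOCA, 1, 0 },   /* %1 = alloca i32        */
        { OP_ALLOCA, 2, 0 },   /* %2 = alloca i32        */
        { OP_STORE,  1, 0 },   /* store i32 0, i32* %1   */
        { OP_STORE,  2, 2 },   /* store i32 2, i32* %2   */
        { OP_RET,   -1, 0 },   /* ret i32 0              */
    };
    size_t n = sizeof program / sizeof program[0];

    for (size_t i = 0; i < n; i++)
        code[i] = program[i];
    return remove_unread_slots(code, n);
}
```

Applied to the five instructions from the question, neither slot is ever loaded, so only the final return is left.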

The redundant variable is there because Clang is done with syntax analysis, but LLVM hasn't even started with optimization.

The Clang front end has generated IR (Intermediate Representation), not machine code. Those variables are in SSA (Static Single Assignment) form; they haven't been bound to registers yet and, after optimization, never will be, because they are redundant.
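For readers who have not met SSA before, here is a small sketch of the idea in plain C. The function names are invented for this example, and real SSA also needs phi nodes at control-flow joins, which this straight-line code does not show.

```c
#include <assert.h>

/* Source-level view: one variable, assigned three times. */
int mutable_form(int a) {
    int x = a;
    x = x + 1;
    x = x * 2;
    return x;
}

/* SSA-style view: every value gets its own name and is assigned once. */
int ssa_form(int a) {
    const int x0 = a;
    const int x1 = x0 + 1;
    const int x2 = x1 * 2;
    return x2;
}

/* Sanity check that both spellings compute the same result. */
void check_forms_agree(int a) {
    assert(mutable_form(a) == ssa_form(a));
}
```

In the IR from the question, %1 and %2 are single-assignment names of this kind: each one names the address of a stack slot, while the slot itself stays mutable through store instructions.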

That code is a somewhat literal representation of the source. It is what Clang hands to LLVM for optimization. Basically, LLVM starts with that and optimizes from there. Indeed, for version 10 and x86_64, llc -O2 will eventually generate:

main: # @main
  xor eax, eax
  ret
