The number of function arguments in LLVM IR differs from the source code

Question

I would like to do some analysis on each function in LLVM IR. However, when I generate LLVM IR from my example C code, I find that in some cases the number of arguments of a function differs from the C source. For example:

My example C code is as follows:

struct outer_s{
    int a;
    int b;
    int c;
};

void func_a(struct outer_s z){
    // nothing
}

However, the generated LLVM IR is as follows:

%struct.outer_s = type { i32, i32, i32 }

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @func_a(i64, i32) #0 {
  %3 = alloca %struct.outer_s, align 4
  %4 = alloca { i64, i32 }, align 4
  %5 = getelementptr inbounds { i64, i32 }, { i64, i32 }* %4, i32 0, i32 0
  store i64 %0, i64* %5, align 4
  %6 = getelementptr inbounds { i64, i32 }, { i64, i32 }* %4, i32 0, i32 1
  store i32 %1, i32* %6, align 4
  %7 = bitcast %struct.outer_s* %3 to i8*
  %8 = bitcast { i64, i32 }* %4 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 4 %7, i8* align 4 %8, i64 12, i1 false)
  ret void
}

As shown above, the "struct outer_s z" argument is split into two parts. If I use the API to get the function arguments, the argument count I get is also 2.
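To make the split concrete, here is a rough C sketch of what the IR above appears to do: the caller hands over the 12-byte struct as one 64-bit value plus one 32-bit value, and the callee copies them back into a struct outer_s. The helper names split_outer and join_outer are hypothetical and exist only for illustration, and the sketch assumes a 4-byte int with no padding in the struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct outer_s {
    int a;
    int b;
    int c;
};

/* Assumption: int is 4 bytes and the struct has no padding (12 bytes total). */
static_assert(sizeof(struct outer_s) == 12, "sketch assumes a 12-byte struct");

/* Caller side (hypothetical helper): split the struct into the two values
 * carried by the IR signature (i64, i32). */
void split_outer(const struct outer_s *z, uint64_t *lo, uint32_t *hi) {
    memcpy(lo, z, sizeof *lo);                             /* bytes 0..7: a and b */
    memcpy(hi, (const char *)z + sizeof *lo, sizeof *hi);  /* bytes 8..11: c */
}

/* Callee side (hypothetical helper): rebuild the struct from the two values,
 * similar in spirit to the two stores followed by the 12-byte memcpy in the IR. */
struct outer_s join_outer(uint64_t lo, uint32_t hi) {
    struct outer_s z;
    memcpy(&z, &lo, sizeof lo);
    memcpy((char *)&z + sizeof lo, &hi, sizeof hi);
    return z;
}
```

Splitting a struct and joining the two parts again should give back the original field values, which is why the callee can still treat the argument as one struct even though the IR signature has two parameters.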

My analysis starts from the function arguments, so a wrong argument count leads to wrong results. Is there any LLVM pass or clang option that I can use to avoid these "split" cases?

Answer 1

Short answer: no, you need to deal with it.

Long answer: no, you need to deal with it. Even worse, the IR here is very target-dependent, because the argument-passing rules are defined by the platform ABI. There are many complications, because these rules are often stated in terms of the source language, so we need to find a way to model them in LLVM IR, which is usually much lower level than the original source. Often this is quite a non-trivial process (e.g. for passing structs by value, vectors, homogeneous aggregates, etc.), and it can be in some sense "destructive" with respect to the original source: the mapping is not one-to-one. You cannot "switch off" this process, because it is required for correctness.
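As an illustration of how target-specific this is, the sketch below estimates whether a small struct is a candidate for being passed in registers. It assumes the x86-64 System V ABI (which the (i64, i32) signature in the question suggests), a struct containing only integer fields, and enough free argument registers. Under those assumptions, structs of up to 16 bytes are passed in up to two 8-byte chunks ("eightbytes"), while larger ones are passed in memory. The function names are hypothetical, and other targets or field types follow different rules.

```c
#include <stddef.h>

struct outer_s {
    int a;
    int b;
    int c;
};

/* Number of 8-byte chunks ("eightbytes") needed to cover an object of this size. */
size_t eightbytes_needed(size_t size_bytes) {
    return (size_bytes + 7) / 8;
}

/* Simplified assumption: integer-only struct, x86-64 System V, free registers
 * available. Returns 1 if the struct fits in at most two eightbytes and may
 * therefore appear as one or two scalar IR parameters, 0 otherwise. */
int may_be_passed_in_registers(size_t size_bytes) {
    return size_bytes > 0 && eightbytes_needed(size_bytes) <= 2;
}
```

For struct outer_s, with a 4-byte int, the size is 12 bytes, so eightbytes_needed gives 2, which matches the two parameters seen in the IR. An analysis that starts from IR arguments would need this kind of per-target knowledge, or debug information, to map parameters back to source-level arguments.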

Answer 1
0 2019-07-29 22:48:28

LLVM IR中函数的参数编号与源代码不同

问题描述

1 个解决方案

解决方案1 0 2019-07-29 22:48:28

解决方案1
0 2019-07-29 22:48:28