
How to map multiple loops of the same type in a function to the generated basic blocks in LLVM IR?

If the loops are of different types, I can easily identify them by name, but if there are several loops of the same type (say, five while loops), how can I tell which basic block in the LLVM IR corresponds to which loop in the source code?

Doing this manually is easy, since we can walk through the source and the LLVM IR in order, but I am looking for a way to do the same programmatically.

For example, I have the following C source code:

int main()
{
   int count=1;
   while (count <= 4)
   {
        count++;
   }
   while (count > 4)
   {
        count--;
   }
   return 0;
}

When I run the command clang -S -emit-llvm fileName.c, I get fileName.ll with the following content:

; ModuleID = 'abc.c'
source_filename = "abc.c"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.23026"

; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
entry:
  %retval = alloca i32, align 4
  %count = alloca i32, align 4
  store i32 0, i32* %retval, align 4
  store i32 1, i32* %count, align 4
  br label %while.cond

while.cond:                                       ; preds = %while.body, %entry
  %0 = load i32, i32* %count, align 4
  %cmp = icmp sle i32 %0, 4
  br i1 %cmp, label %while.body, label %while.end

while.body:                                       ; preds = %while.cond
  %1 = load i32, i32* %count, align 4
  %inc = add nsw i32 %1, 1
  store i32 %inc, i32* %count, align 4
  br label %while.cond

while.end:                                        ; preds = %while.cond
  br label %while.cond1

while.cond1:                                      ; preds = %while.body3, %while.end
  %2 = load i32, i32* %count, align 4
  %cmp2 = icmp sgt i32 %2, 4
  br i1 %cmp2, label %while.body3, label %while.end4

while.body3:                                      ; preds = %while.cond1
  %3 = load i32, i32* %count, align 4
  %dec = add nsw i32 %3, -1
  store i32 %dec, i32* %count, align 4
  br label %while.cond1

while.end4:                                       ; preds = %while.cond1
  ret i32 0
}

attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}

Two loop-header basic blocks are created for the given source file, while.cond and while.cond1. How can I identify which basic block belongs to which while loop in the source code?

Before I attempt to answer, I want to note that, depending on the selected optimization level or the passes manually selected with opt, that information might not be there or might not be as accurate (e.g. because of inlining, cloning, etc.).

The way to associate low-level representations with source code is debugging information (e.g. in the DWARF format). To produce debugging information, pass the -g command-line flag during compilation.

For LLVM IR, the Loop API has relevant calls such as getStartLoc. So you could do something like this (e.g. inside the runOnFunction method of an llvm::FunctionPass):

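// Inside runOnFunction(llvm::Function &F) of a legacy llvm::FunctionPass whose
// getAnalysisUsage() calls AU.addRequired<llvm::LoopInfoWrapperPass>().
auto &LI = getAnalysis<llvm::LoopInfoWrapperPass>().getLoopInfo();

// LoopInfo iterates over top-level loops only; nested loops are reachable
// through Loop::getSubLoops().
llvm::SmallVector<llvm::Loop *, 8> workList(LI.begin(), LI.end());

for (auto *L : workList) {
  llvm::DebugLoc loc = L->getStartLoc();
  if (!loc)
    continue; // no debug info for this loop (was the file compiled with -g?)

  unsigned line = loc.getLine();
  auto *scope = llvm::dyn_cast<llvm::DIScope>(loc.getScope());
  llvm::StringRef filename = scope ? scope->getFilename() : llvm::StringRef();
  llvm::StringRef header = L->getHeader()->getName(); // e.g. while.cond

  // do stuff here, e.g. record (header, filename, line)
}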

Moreover, for a BasicBlock, you can use the debug-related methods of Instruction (e.g. getDebugLoc) and combine them with calls to other Loop methods such as getHeader.
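The same block-to-line idea can be tried without linking against LLVM by scanning the textual IR. This is a minimal sketch, not the LLVM API: it assumes the .ll file was produced with -g, so that instructions carry a "!dbg !N" attachment and the matching "!N = !DILocation(line: ...)" entries appear in the file. The function name mapBlocksToLines is hypothetical. Blocks without a textual label (for example an unnamed entry block) and quoted labels are not handled.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>

// Maps each labeled basic block in textual LLVM IR to the source line of the
// first instruction in that block that has a !dbg attachment.
// Assumption: the IR was generated with -g. Hypothetical helper, text-based only.
std::map<std::string, unsigned> mapBlocksToLines(const std::string &ir) {
  static const std::regex labelRe(R"(^([A-Za-z0-9_.$-]+):)");
  static const std::regex dbgRe(R"(!dbg !(\d+))");
  static const std::regex locRe(R"(^!(\d+) = !DILocation\(line: (\d+))");

  std::map<std::string, std::string> blockToMetadataId; // label -> metadata id
  std::map<std::string, unsigned> metadataIdToLine;     // metadata id -> line

  std::istringstream in(ir);
  std::string line;
  std::string currentBlock;
  std::smatch m;
  while (std::getline(in, line)) {
    if (std::regex_search(line, m, locRe)) {
      metadataIdToLine[m[1].str()] =
          static_cast<unsigned>(std::stoul(m[2].str()));
    } else if (std::regex_search(line, m, labelRe)) {
      currentBlock = m[1].str(); // a new basic block starts here
    } else if (line == "}") {
      currentBlock.clear(); // end of the function body
    } else if (!currentBlock.empty() && std::regex_search(line, m, dbgRe)) {
      // emplace keeps the first attachment seen for this block
      blockToMetadataId.emplace(currentBlock, m[1].str());
    }
  }

  // Resolve metadata ids to line numbers once the whole file has been read,
  // because the !DILocation entries come after the function bodies.
  std::map<std::string, unsigned> result;
  for (const auto &entry : blockToMetadataId) {
    auto it = metadataIdToLine.find(entry.second);
    if (it != metadataIdToLine.end())
      result[entry.first] = it->second;
  }
  return result;
}
```

Passing the contents of a .ll file generated with clang -g -S -emit-llvm should yield one entry per labeled block, so loop headers such as while.cond and while.cond1 can be told apart by the line of their while statement. The caveat about optimization levels applies here as well.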

Also, note that there is a getLoopID method that returns an internal unique ID for each loop (stored as llvm.loop metadata), but it is not always there and it is subject to the potential elisions I mentioned at the start. Anyhow, if you need to manipulate it, look at the examples in the LLVM source that use the setLoopID method (e.g. in lib/Transforms/Scalar/LoopRotation.cpp).


 