How to map multiple loops of the same type in a function to the generated basic blocks in LLVM IR?
If the loops are of different types, I can easily identify them by name, but if there are multiple loops of the same type (say, 5 while loops), how can I identify which basic block in the LLVM IR corresponds to which loop in the source code?
Manually it is easy, since we can walk through the source code and the LLVM IR sequentially, but I am looking for a way to do the same identification programmatically.
For example, I have the following C source code:
int main()
{
    int count=1;
    while (count <= 4)
    {
        count++;
    }
    while (count > 4)
    {
        count--;
    }
    return 0;
}
When I run the command clang -S -emit-llvm fileName.c, I get a file fileName.ll with the following content:
; ModuleID = 'abc.c'
source_filename = "abc.c"
target datalayout = "e-m:w-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-windows-msvc19.0.23026"
; Function Attrs: noinline nounwind uwtable
define i32 @main() #0 {
entry:
%retval = alloca i32, align 4
%count = alloca i32, align 4
store i32 0, i32* %retval, align 4
store i32 1, i32* %count, align 4
br label %while.cond
while.cond: ; preds = %while.body, %entry
%0 = load i32, i32* %count, align 4
%cmp = icmp sle i32 %0, 4
br i1 %cmp, label %while.body, label %while.end
while.body: ; preds = %while.cond
%1 = load i32, i32* %count, align 4
%inc = add nsw i32 %1, 1
store i32 %inc, i32* %count, align 4
br label %while.cond
while.end: ; preds = %while.cond
br label %while.cond1
while.cond1: ; preds = %while.body3, %while.end
%2 = load i32, i32* %count, align 4
%cmp2 = icmp sgt i32 %2, 4
br i1 %cmp2, label %while.body3, label %while.end4
while.body3: ; preds = %while.cond1
%3 = load i32, i32* %count, align 4
%dec = add nsw i32 %3, -1
store i32 %dec, i32* %count, align 4
br label %while.cond1
while.end4: ; preds = %while.cond1
ret i32 0
}
attributes #0 = { noinline nounwind uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="false" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"PIC Level", i32 2}
!1 = !{!"clang version 4.0.0 (tags/RELEASE_400/final)"}
Among the generated basic blocks there are two loop condition blocks, while.cond and while.cond1. How can I identify which basic block belongs to which while loop in the source code?
Before I attempt to answer, note that depending on the selected optimization level, or on the passes you run manually with opt, that information might be missing or less accurate (e.g. because of inlining, cloning, etc.).
The way to associate low-level representations with source code is debugging information (e.g. in the DWARF format). To produce debugging information, pass the -g command-line flag during compilation.
For LLVM IR, the Loop API has relevant calls such as getStartLoc. So you could do something like this (e.g. inside the run method of a legacy pass that processes an llvm::Function):
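// CurFunc is the llvm::Function being processed. Passing it to getAnalysis is the
// ModulePass form; inside a FunctionPass's runOnFunction, call getAnalysis<...>() with no argument.
llvm::SmallVector<llvm::Loop *, 8> workList;
auto &LI = getAnalysis<llvm::LoopInfoWrapperPass>(CurFunc).getLoopInfo();
// LoopInfo iterates over top-level loops only; use getSubLoops() to reach nested ones.
for (llvm::Loop *L : LI)
  workList.push_back(L);
for (auto *e : workList) {
  llvm::DebugLoc loc = e->getStartLoc();
  if (!loc)
    continue; // no debug location, e.g. compiled without -g
  auto line = loc.getLine();
  auto *scope = llvm::dyn_cast<llvm::DIScope>(loc.getScope());
  if (!scope)
    continue;
  auto filename = scope->getFilename();
  // do stuff here
}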
Moreover, for a BasicBlock, you can use the debug-related methods of Instruction (e.g. getDebugLoc) and combine them with other Loop methods such as getHeader.
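To make the idea concrete without linking against LLVM, here is a minimal sketch that scans textual IR and records, for each basic block label, the source line of the first instruction carrying a !dbg attachment. It is plain text scanning, not the LLVM API. The sample IR is a hand-written, abbreviated fragment shaped like clang -g -S -emit-llvm output, and the names mapBlocksToLines and kSampleIR, the metadata ids and the column numbers are all made up for illustration.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

// Hand-written, abbreviated fragment (NOT real clang output). The metadata ids
// (!7, !11..!13) and column numbers are assumptions made for illustration.
const std::string kSampleIR = R"IR(
define i32 @main() !dbg !7 {
entry:
  store i32 1, i32* %count, align 4, !dbg !11
  br label %while.cond, !dbg !12
while.cond:
  %0 = load i32, i32* %count, align 4, !dbg !12
  %cmp = icmp sle i32 %0, 4, !dbg !12
  br i1 %cmp, label %while.body, label %while.end, !dbg !12
while.cond1:
  %2 = load i32, i32* %count, align 4, !dbg !13
  %cmp2 = icmp sgt i32 %2, 4, !dbg !13
  br i1 %cmp2, label %while.body3, label %while.end4, !dbg !13
}
!11 = !DILocation(line: 3, column: 5, scope: !7)
!12 = !DILocation(line: 4, column: 3, scope: !7)
!13 = !DILocation(line: 8, column: 3, scope: !7)
)IR";

// Hypothetical helper: returns block label -> source line of the first
// instruction in that block that has a !dbg attachment.
std::map<std::string, int> mapBlocksToLines(const std::string &ir) {
  const std::string locTag = " = !DILocation(line: ";
  const std::string dbgTag = "!dbg !";
  std::string line;

  // Pass 1: collect "!N = !DILocation(line: X, ...)" as N -> X.
  // Metadata is defined after its uses, hence the two passes.
  std::map<int, int> locLine;
  std::istringstream pass1(ir);
  while (std::getline(pass1, line)) {
    auto pos = line.find(locTag);
    if (line.empty() || line[0] != '!' || pos == std::string::npos)
      continue;
    int id = std::stoi(line.substr(1, pos - 1));
    locLine[id] = std::stoi(line.substr(pos + locTag.size()));
  }

  // Pass 2: track the current block label and resolve its first !dbg use.
  std::map<std::string, int> blockLine;
  std::string current;
  std::istringstream pass2(ir);
  while (std::getline(pass2, line)) {
    if (line.empty())
      continue;
    if (line[0] == '}') { // end of function body
      current.clear();
      continue;
    }
    auto colon = line.find(':');
    // A label line starts in column 0 and has no space before its colon.
    bool isLabel = !std::isspace(static_cast<unsigned char>(line[0])) &&
                   line[0] != '!' && line[0] != ';' &&
                   colon != std::string::npos && line.find(' ') > colon;
    if (isLabel) {
      current = line.substr(0, colon);
      continue;
    }
    auto dbg = line.find(dbgTag);
    if (current.empty() || dbg == std::string::npos || blockLine.count(current))
      continue;
    auto it = locLine.find(std::stoi(line.substr(dbg + dbgTag.size())));
    if (it != locLine.end())
      blockLine[current] = it->second;
  }
  return blockLine;
}
```

With the fragment above, mapBlocksToLines(kSampleIR) maps entry to line 3, while.cond to line 4 and while.cond1 to line 8, which are the lines of the two while statements in the question's source. Inside a real pass, the same lookup would go through the header block's instructions and getDebugLoc rather than through string matching.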
Also, note that there is a getLoopID method, which returns an internal unique ID for each loop (its loop metadata node), but it is not always present and is subject to the potential elisions I mentioned at the start. Anyhow, if you need to manipulate it, look at how the LLVM sources use the setLoopID method (e.g. in lib/Transforms/Scalar/LoopRotation.cpp).