
Parsing and Modifying LLVM IR code

I want to read (parse) LLVM IR code (which is saved in a text file) and add some of my own code to it. I need an example of how this is done using the libraries LLVM provides for this purpose. Basically, I want to read the IR code from a text file into memory (perhaps the LLVM library represents it in AST form, I don't know), make modifications such as adding more nodes to the AST, and finally write the AST back to the IR text file.

Although I need to both read and modify the IR code, I would greatly appreciate it if someone could provide or point me to an example that just reads (parses) it.

First, to fix an obvious misunderstanding: LLVM is a framework for manipulating code in IR format. There are no ASTs in sight (*) - you read IR, transform/manipulate/analyze it, and you write IR back.

Reading IR is really simple:

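#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IRReader/IRReader.h"
#include "llvm/Support/SourceMgr.h"
#include "llvm/Support/raw_ostream.h"
#include <memory>

using namespace llvm;

int main(int argc, char** argv)
{
    if (argc < 2) {
        errs() << "Expected an argument - IR file name\n";
        return 1;
    }

    // Older LLVM releases spelled this getGlobalContext() and ParseIRFile(),
    // which returned a raw Module*; current releases return a unique_ptr.
    LLVMContext Context;
    SMDiagnostic Err;
    std::unique_ptr<Module> Mod = parseIRFile(argv[1], Err, Context);

    if (!Mod) {
        // Report the parse error, prefixed with the program name.
        Err.print(argv[0], errs());
        return 1;
    }

    // [...]
    return 0;
}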

This code accepts a file name, which should be a textual LLVM IR file. It then parses the file into a Module, which represents a module of IR in LLVM's internal in-memory format. The Module can then be manipulated with the various passes LLVM has, or with passes you add on your own. Take a look at some examples in the LLVM code base (such as lib/Transforms/Hello/Hello.cpp) and read this - http://llvm.org/docs/WritingAnLLVMPass.html.

Spitting IR back into a file is even easier. The Module class just writes itself to a stream:

 some_stream << *Mod;

That's it.

Now, if you have any specific questions about specific modifications you want to do to IR code, you should really ask something more focused. I hope this answer shows you how to parse IR and write it back.


(*) IR doesn't have an AST representation inside LLVM, because it's a simple assembly-like language. If you go one step up, to C or C++, you can use Clang to parse that into ASTs, and then do manipulations at the AST level. Clang then knows how to produce LLVM IR from its AST. However, you do have to start with C/C++ here, and not LLVM IR. If LLVM IR is all you care about, forget about ASTs.

This is usually done by implementing an LLVM pass/transform. This way you don't have to parse the IR yourself, because LLVM will do it for you and you will operate on an object-oriented in-memory representation of the IR.

This is the entry point for writing an LLVM pass. Then you can look at any of the already implemented standard passes that come bundled with LLVM (look into lib/Transforms).

The easiest way to do this is to look at one of the existing tools and steal code from it. In this case, you might want to look at the source for llc. It can take either a bitcode or .ll file as input. You can modify the input any way you want and then, if you want a text file, write it out using something similar to the code in llvm-dis.

The opt tool takes LLVM IR code, runs a pass on it, and then spits out the transformed LLVM IR on the other side.

The easiest place to start hacking is lib/Transforms/Hello/Hello.cpp. Hack it, run it through opt with your source file as input, and inspect the output.

Apart from that, the docs for writing passes are really quite good.

As mentioned above, the best way is to write a pass. But if you simply want to iterate through the instructions and do something with them, LLVM provides an InstVisitor class. It is a class that implements the visitor pattern for instructions. It is very straightforward to use, so if you want to avoid learning how to implement a pass, you could resort to that.
