
Reverse engineering C programs

Every C program is compiled to machine code, and that binary is what gets distributed. Since the instruction set of a computer is well known, is it possible to get back the original C program?

You can never get back the exact same source, since no metadata about it is saved with the compiled code.

But you can re-create code from the assembly code.

Check out this book if you are interested in these things: Reversing: Secrets of Reverse Engineering.

Edit

Some compilers 101 here: if you were to define a compiler with another word, one less technical than "compiler", what would it be?

Answer: Translator

A compiler translates the syntax / phrases you have written into another language. A C compiler translates to assembly or even machine code; C# code is translated to IL, and so forth.

The executable you have is just a translation of your original text / syntax, and if you want to "reverse it", that is, "translate it back", you will most likely not get the same structure you had at the start.

A more real-life example: if you translate from English to German and then from German back to English, the sentence structure will most likely be different and other words might be used, but the meaning, the context, will most likely not have changed.

The same goes for a compiler / translator: if you go from C to ASM, the logic is the same; it's just a different way of reading it (and of course it's optimized).

It depends on what you mean by the original C program. Things like local variable names, comments, etc. are not included in the binary, so there's no way to get the exact same source code as the one used to produce the binary. Tools such as IDA Pro might help you disassemble a binary.
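To make the point about lost names and comments concrete, here is a small sketch. The first function stands in for original source; the second is a hand-written imitation of what a decompiler might reconstruct, with placeholder names in the style such tools commonly generate. All function and variable names here are hypothetical, and the second function is not the output of any real tool.

```c
#include <assert.h>
#include <stddef.h>

/* "Original" source: descriptive names and a comment explaining intent. */
int sum_positive_readings(const int *readings, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        /* Negative readings are sensor errors, so skip them. */
        if (readings[i] > 0)
            total += readings[i];
    }
    return total;
}

/* Hand-written imitation of a reconstruction: the logic is the same,
 * but the names are placeholders, the comment is gone, and the loop
 * has a different shape. Illustrative only. */
int sub_401000(const int *a1, size_t a2)
{
    int v3 = 0;
    size_t v4 = 0;
    while (v4 < a2) {
        if (a1[v4] > 0)
            v3 += a1[v4];
        ++v4;
    }
    return v3;
}

/* Both versions compute the same result for the same input. */
void check_same_behaviour(void)
{
    const int data[] = { 4, -1, 7, 0, -3, 2 };
    size_t n = sizeof data / sizeof data[0];
    assert(sum_positive_readings(data, n) == 13);
    assert(sub_401000(data, n) == 13);
}
```

The second version is functionally equivalent, but nothing in it tells you that the negative values were sensor errors; that knowledge lived only in the source.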

I would guesstimate the conversion rate of a really skilled hacker at about 1 kilobyte of machine code per day. At common Western salaries, that puts the price of, say, a 100 KB executable at about $25,000. After spending that much money, all that's gained is a chunk of C code that does exactly what yours does, minus the benefit of comments and whatnot. It is in no way competitive with your version; you'll be able to deliver updates and improvements much quicker. Reverse engineering those updates is a non-trivial effort as well.
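Spelling out the arithmetic behind that estimate (the $250 day rate is implied by the figures above rather than stated in them):

$$
\frac{100\ \text{KB}}{1\ \text{KB/day}} = 100\ \text{days},
\qquad
\frac{\$25{,}000}{100\ \text{days}} = \$250\ \text{per day}
$$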

If that price tag doesn't impress you, you can arbitrarily raise the conversion cost by adding more code. Just keep in mind that skilled hackers who can tackle large programs like this have something much better to do. They write their own code.

One of the best works on this topic that I know about is:

Pigs from sausages? Reengineering from assembler to C via FermaT.

The claim is that you get back a reasonable C program, even if the original asm code was not written in C! Lots of caveats apply.

Working on tools that do this is a research activity. That is, it is possible to get something in the easy cases (though you won't recover local variable names unless debug symbols are present, for instance). It's nearly impossible in practice for large programs, or if the programmer has decided to make it difficult.

The Hex-Rays decompiler (an extension to IDA Pro) can do exactly that. It's still fairly recent and up-and-coming, but it shows great promise. It takes a little getting used to but can potentially speed up the reversing process. It's not a "silver bullet" (no C decompiler is), but it's a great asset.

The common name for this procedure is "turning hamburger back into cows." It's possible to reverse engineer binary code into a functionally equivalent C program, but whether that C code bears a close resemblance to the original is an open question.

There is no 1:1 mapping between a C program and the ASM/machine code it will produce: one C program can compile to different results on different compilers or with different settings, and sometimes two different bits of C can produce the same machine code.
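As a rough illustration of the many-to-one direction, here are three differently written functions that compute the same value for every `unsigned` input. An optimizing compiler will often emit identical instructions for all three, though that depends on the compiler and its settings and is not guaranteed. The function names are made up for this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Variant A: multiply by eight. */
unsigned times_eight_mul(unsigned x)
{
    return x * 8u;
}

/* Variant B: shift left by three bits. */
unsigned times_eight_shift(unsigned x)
{
    return x << 3;
}

/* Variant C: three doublings by addition. */
unsigned times_eight_add(unsigned x)
{
    unsigned twice = x + x;
    unsigned four = twice + twice;
    return four + four;
}

/* Unsigned arithmetic wraps around, so the three variants agree
 * even when the result overflows. */
void check_variants_agree(void)
{
    const unsigned samples[] = { 0u, 1u, 5u, 1000u, UINT_MAX };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        unsigned expected = times_eight_mul(samples[i]);
        assert(times_eight_shift(samples[i]) == expected);
        assert(times_eight_add(samples[i]) == expected);
    }
}
```

If all three do compile to the same instruction, a decompiler looking at the binary has no way to tell which one the author actually wrote.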

You definitely can generate C code from a compiled EXE. You just can't know how similar in structure it will be to the original code: apart from variable/function names being lost, I assume it won't know the original way the code was split amongst many files.

You could try hex-rays.com; it has a very good decompiler that can decompile assembly code into C with 99% accuracy.

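Notice: the technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.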

 