
Compiling Assembler Opcodes

I was wondering if it were possible to replace assembler instructions by their equivalent opcodes (i.e. being able to compile opcodes rather than instructions). If so, is it possible to manipulate these opcodes at runtime? Cheers

if it were possible to replace assembler instructions by their equivalent opcodes.

Yes, you can compile opcodes; the resulting machine code will be identical.

For example, this short, useless piece of x86-32 assembly code:

uselessFunc:
    xor  eax,eax
    ret

can be written with opcodes too:

uselessFunc:
    db  0x31, 0xC0    ; opcode "xor eax,eax"
    db  0xC3          ; opcode "ret"

Both sources produce the same three bytes of machine code: 31 C0 C3.

is it possible to manipulate these opcodes at runtime

That's completely unrelated to the form of the source. At runtime you can manipulate any memory to which you have write access (ideally read+write access). But if you want to run the opcodes after modifying them, you also need execute access to that memory.

On a modern x86 machine with a modern OS like Linux, this is not the default configuration. By default the code segment is read-only + executable, and the data segment is read+write but not executable. So if you try to modify the opcodes of your own code, you will crash with an invalid memory access during the write, and if you try to execute opcodes in the data segment, you will trigger a no-exec fault.

So applications like the Java VM and similar, which produce code at runtime and then execute it (the "JIT" just-in-time compiler compiles the Java bytecode from .class files into native machine code at runtime, to get better performance for parts that are executed repeatedly), do not only produce/modify opcodes. They also manage the target memory pages with additional system calls, first making them writable and then changing them to read+exec (no longer writable) code pages. I.e. it is usually possible, but on many target environments you have to use additional system services to make it work correctly.
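The page-permission dance described above can be sketched in C roughly as follows. This is a minimal illustration, not how any particular JIT does it: it assumes a POSIX system (Linux-style `mmap`/`mprotect`), an x86 or x86-64 CPU, and a fixed 4096-byte mapping; the function name `run_generated_code` is made up for this example. It copies the three bytes 31 C0 C3 into fresh memory, flips the page from read+write to read+exec, and calls it.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Machine code of the example function: xor eax,eax ; ret */
static const unsigned char useless_func_code[] = { 0x31, 0xC0, 0xC3 };

/* Hypothetical helper: returns the generated function's result (0),
 * or -1 if the OS refuses the mapping or the CPU is not x86. */
int run_generated_code(void)
{
#if defined(__x86_64__) || defined(__i386__)
    const size_t len = 4096;     /* assumed mapping size for this sketch */
    assert(sizeof useless_func_code <= len);

    /* 1. Ask the OS for memory that is readable and writable, not executable. */
    unsigned char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* 2. "Generate" the code: here we only copy the opcode bytes. */
    memcpy(mem, useless_func_code, sizeof useless_func_code);

    /* 3. Drop write access and add execute access before running it. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, len);
        return -1;
    }

    /* 4. Call the bytes as a function (object-to-function pointer cast,
     *    allowed by POSIX but not by strict ISO C). */
    int (*fn)(void) = (int (*)(void))mem;
    int result = fn();           /* xor eax,eax leaves 0 in eax */

    munmap(mem, len);
    return result;
#else
    return -1;                   /* the opcodes above are x86-only */
#endif
}
```

Calling `run_generated_code()` from `main` should yield 0 on an x86 Linux system; hardened environments that forbid executable anonymous mappings will make it return -1 instead.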

Bear in mind that self-modifying code is considered bad practice in the modern era, not only because it is more difficult to debug, but also because, if used naively, it may have huge performance implications. For example, on x86 CPUs, modifying opcodes only a few bytes ahead of execution will invalidate all kinds of caches/prefetch lines in the CPU, making it stall for a short while to re-read/decode the instructions. And on some CPUs the memory/cache model is weaker than on x86, so a modification made too late may be ignored by the CPU, because it has already decoded the old content and will execute that.

But as long as you know what you are doing, producing/modifying opcodes is possible. It just doesn't depend in any way on the form of your source. It doesn't matter how you produced the original binary, whether you wrote those opcodes in assembly or C language source, or typed them into a hex editor directly as byte values.

With the two examples above, in both cases you can do:

mov   byte [uselessFunc+1],0xD8 ; modify xor eax,eax to xor eax,ebx

If you obtain write access to the target area of memory, and it keeps its executable rights, then this will turn xor eax,eax into xor eax,ebx in both cases.
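On a system where code pages are not writable by default, the one-byte patch needs the page to be made writable first. The following C sketch shows that sequence on a private copy of the three opcode bytes, assuming POSIX `mmap`/`mprotect` and a 4096-byte mapping; `patch_useless_func` is a hypothetical name. It deliberately does not execute the patched code, because the result of xor eax,ebx depends on whatever happens to be in eax and ebx at the time.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: returns the patched second byte (0xD8) on success,
 * or -1 if the OS refuses one of the permission changes. */
int patch_useless_func(void)
{
    static const unsigned char code[] = { 0x31, 0xC0, 0xC3 }; /* xor eax,eax ; ret */
    const size_t len = 4096;     /* assumed mapping size for this sketch */
    const size_t modrm_offset = 1;
    int result = -1;
    assert(modrm_offset < sizeof code);

    unsigned char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);

    /* Put the page into the usual "code" state: readable + executable. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0)
        goto done;

    /* Writing mem[1] now would fault, so make the page writable again... */
    if (mprotect(mem, len, PROT_READ | PROT_WRITE) != 0)
        goto done;

    /* ...apply the same change as "mov byte [uselessFunc+1],0xD8"... */
    mem[modrm_offset] = 0xD8;    /* 31 D8 encodes xor eax,ebx */

    /* ...and restore read + execute so the bytes could be run again. */
    if (mprotect(mem, len, PROT_READ | PROT_EXEC) != 0)
        goto done;

    result = mem[modrm_offset];  /* read back the patched byte */

done:
    munmap(mem, len);
    return result;
}
```

Toggling between writable and executable, rather than mapping the page read+write+exec throughout, is a design choice in this sketch; some systems refuse pages that are writable and executable at the same time.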

