
Compiling Assembler Opcodes

I was wondering if it were possible to replace assembler instructions by their equivalent opcodes (i.e. being able to compile opcodes rather than instructions). If so, is it possible to manipulate these opcodes at runtime? Cheers

if it were possible to replace assembler instructions by their equivalent opcodes.

Yes, you can compile opcodes; the resulting machine code will be identical.

For example, this short, useless piece of x86-32 assembly:

uselessFunc:
    xor  eax,eax
    ret

It can be written with opcodes too:

uselessFunc:
    db  0x31, 0xC0    ; opcode "xor eax,eax"
    db  0xC3          ; opcode "ret"

Both sources produce the same three bytes of machine code: 31 C0 C3.

is it possible to manipulate these opcodes at runtime

That's completely unrelated to the form of the source. At runtime you can modify any memory to which you have write access (ideally read+write access). But if you want to run the opcodes after modifying them, you also need execute access to that memory.

On a modern x86 machine with a modern OS like Linux this is not the default configuration. By default the code segment is read-only + executable, and the data segment is read+write but not executable. So if you try to modify the opcodes of your code, you will crash on an invalid memory access during the write, and if you try to execute opcodes in the data segment, you will trigger a no-exec fault.

So applications like the Java VM, which produce code at runtime and then execute it (the "JIT" just-in-time compiler compiles the Java bytecode from .class files at runtime into native machine code to get better performance for parts which are executed repeatedly), do not only produce/modify opcodes. They also manage the target memory pages with additional system calls, first making them writable and then changing them to read+exec (no longer writable) code pages. I.e. it's usually possible, but on many target environments you have to use additional system services to make it work correctly.
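The write-first, execute-later sequence can be sketched in C. This is only a minimal illustration under assumptions, not how any particular JIT does it: it assumes Linux on an x86 CPU and uses the POSIX calls mmap, mprotect and munmap, and run_machine_code and useless_func_bytes are hypothetical names made up for this example. It copies the three bytes of uselessFunc into a writable page, switches the page to read+exec, and calls it.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* The same three bytes as uselessFunc: xor eax,eax ; ret */
static const unsigned char useless_func_bytes[] = { 0x31, 0xC0, 0xC3 };

typedef int (*int_fn)(void);

/* Hypothetical helper: copy `len` bytes of machine code into a fresh page,
   make that page executable and call it as `int f(void)`.
   Returns the result of the call, or -1 if a step fails or the CPU is not x86. */
int run_machine_code(const unsigned char *code, size_t len)
{
#if defined(__i386__) || defined(__x86_64__)
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len > (size_t)page)
        return -1;

    /* 1. Ask the OS for a page that is writable but not executable. */
    void *mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* 2. Write the opcodes into it as ordinary data. */
    memcpy(mem, code, len);

    /* 3. Drop write access and add execute access. */
    if (mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)page);
        return -1;
    }

    /* 4. Call the bytes as a function; the result comes back in eax. */
    int_fn fn = (int_fn)mem;
    int result = fn();

    munmap(mem, (size_t)page);
    return result;
#else
    (void)code;
    (void)len;
    return -1;                   /* these opcodes only mean something on x86 */
#endif
}
```

On x86, run_machine_code(useless_func_bytes, sizeof useless_func_bytes) should return 0, because xor eax,eax zeroes the register that carries the return value. The page is never writable and executable at the same time, which is the usual reason for doing this in two steps.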

Bear in mind that self-modifying code is considered bad practice in the modern era, not only because it's more difficult to debug, but also because, if used naively, it may have huge performance implications. For example, on x86 CPUs, modifying opcodes only a few bytes ahead of execution invalidates all kinds of caches and prefetch lines in the CPU, making it stall for a short while to re-read and decode the instructions. And on some CPUs the memory/cache model is weaker than on x86, so a modification made too late may be ignored by the CPU, as it has already decoded the old content and will execute that.

But as long as you know what you are doing, producing or modifying opcodes is possible. It just doesn't depend in any way on the form of your source: it doesn't matter how you produced the original binary, whether you wrote those opcodes in assembly or C source, or typed them into a hex editor directly as byte values.

With either of the two examples above, you can do:

mov   byte [uselessFunc+1],0xD8 ; modify xor eax,eax to xor eax,ebx

If you get write access to the target area of memory and it keeps its execute rights, this turns xor eax,eax into xor eax,ebx in both cases.
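Why 0xD8? The second byte of this instruction is the ModRM byte, which encodes both registers. The sketch below, in C, shows the arithmetic. The names modrm_reg_reg and patch_xor_source are hypothetical helpers for illustration, and it only touches a plain byte buffer, so it says nothing about page permissions.

```c
#include <assert.h>
#include <stdint.h>

/* x86 32-bit register numbers as they appear in the ModRM byte. */
enum { REG_EAX = 0, REG_ECX = 1, REG_EDX = 2, REG_EBX = 3 };

/* ModRM byte for a register-to-register form:
   mod = 11b (bits 7..6), `reg` field in bits 5..3, `rm` field in bits 2..0.
   For opcode 0x31 (xor r/m32, r32), `rm` is the destination and `reg` the source. */
uint8_t modrm_reg_reg(unsigned reg, unsigned rm)
{
    assert(reg < 8 && rm < 8);
    return (uint8_t)(0xC0u | (reg << 3) | rm);
}

/* Rewrite an encoded "xor eax,eax" so that it becomes "xor eax,<src>".
   `code` must point to writable memory holding at least two bytes. */
void patch_xor_source(uint8_t *code, unsigned src)
{
    assert(code[0] == 0x31);     /* only meaningful for this opcode */
    code[1] = modrm_reg_reg(src, REG_EAX);
}
```

With src = REG_EBX the ModRM byte becomes 0xC0 | (3 << 3) | 0 = 0xD8, which is the value written by the mov instruction above.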
