
Am I guaranteed to not encounter non-64-bit instructions if there are no compatibility mode switches in x86-64?

I know that a 64-bit program could theoretically switch to 32-bit mode by changing the CS as explained here , and I assume that applies to switching to 16-bit mode as well.

  1. If I run a 64-bit program that I know has no compatibility-mode switches, am I guaranteed not to run into non-64-bit instructions?

  2. I know that the 66 and 67 hex prefixes can switch an instruction between 16 and 32-bit mode (pg 36) , but those prefixes would not show up in 64-bit mode, correct?

  3. If I'm wrong, what are non-64 bit instructions that I could encounter in a 64-bit execution?

My goal would be to write an x86-64 decoder and I want to know if only handling 64-bit instruction cases is sufficient for my use-case (64-bit programs).

Every sequence of bytes of machine code either decodes as instructions or raises a #UD illegal-instruction exception. With the CPU in 64-bit mode, that means the bytes are decoded as 64-bit-mode instructions if they don't fault. See also "Is x86 32-bit assembly code valid x86 64-bit assembly code?" (no, not in general).

If it's a normal program emitted by a compiler, it's unlikely there are any illegal instructions in its machine code, unless someone used inline asm or you're disassembling a non-code section. The other exception is an obfuscated program that puts partial instructions ahead of an actual jump target, so simple disassemblers get confused and decode with instruction boundaries different from how the code will actually run. That trick works because x86 machine code is a byte stream that is not self-synchronizing.
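To make the "not self-synchronizing" point concrete, here is a minimal sketch of a linear-sweep decoder. It is a toy, not a real disassembler: the function name decode_toy is made up for illustration, and it only understands three opcodes (90 nop, c3 ret, and b8+r mov r32, imm32), which are encoded the same way in 32 and 64-bit mode.

```python
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_toy(code: bytes, start: int) -> list:
    """Linear sweep from `start`; hypothetical helper that knows only 3 opcodes."""
    out = []
    i = start
    while i < len(code):
        op = code[i]
        if op == 0x90:                      # nop
            out.append("nop")
            i += 1
        elif op == 0xC3:                    # ret
            out.append("ret")
            i += 1
        elif 0xB8 <= op <= 0xBF:            # mov r32, imm32 (prefixes not handled)
            if i + 5 > len(code):
                out.append("(truncated)")
                break
            imm = int.from_bytes(code[i + 1:i + 5], "little")
            out.append(f"mov {REGS32[op - 0xB8]},{imm:#x}")
            i += 5
        else:                               # everything else is out of scope here
            out.append(f"(unhandled {op:#04x})")
            i += 1
    return out

code = bytes.fromhex("b8 90 90 90 90 c3")
print(decode_toy(code, 0))   # one mov, then ret
print(decode_toy(code, 1))   # four nops, then ret
```

Starting one byte later gives a completely different but still valid instruction stream, which is why a decoder has to start from known instruction boundaries (entry points, jump targets) rather than arbitrary offsets.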

TL;DR: in a normal program, yes, every sequence of bytes you encounter when disassembling is a valid 64-bit-mode instruction.


66 and 67 do not switch modes; they only override the operand-size (66) or address-size (67) for that one instruction. For example, in 66 40 90 the 40 is still a REX prefix in 64-bit mode (for the NOP instruction that follows). So the whole sequence is just one nop ( xchg ax,ax ); the 66 prefix does not make it decode as it would in 32-bit mode, as inc ax / xchg eax,eax .

Try assembling db 0x66, 0x40, 0x90 with nasm -felf32 and then with nasm -felf64 , and disassemble each result: the same byte sequence decodes differently in 64-bit mode than it does in 32-bit mode.
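If you don't have NASM handy, the same difference can be sketched in a few lines of Python. This is only an illustration under narrow assumptions: decode_toy_mode is a hypothetical helper, and it handles just the 66 prefix, REX prefixes (64-bit mode only), the one-byte inc/dec encodings 40..4f (32-bit mode only), and opcode 90. It names 66 90 as xchg ax,ax, which is still a nop in effect.

```python
REGS16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_toy_mode(code: bytes, bits: int) -> list:
    """Toy decoder: the mode (32 or 64) is fixed up front, never changed by a prefix."""
    out = []
    i = 0
    while i < len(code):
        opsize16 = False
        while i < len(code) and code[i] == 0x66:    # operand-size override prefix
            opsize16 = True
            i += 1
        rex = 0
        # 40..4f are REX prefixes only in 64-bit mode
        if bits == 64 and i < len(code) and 0x40 <= code[i] <= 0x4F:
            rex = code[i]
            i += 1
        if i >= len(code):
            out.append("(truncated)")
            break
        op = code[i]
        i += 1
        regs = REGS16 if opsize16 else REGS32
        if op == 0x90 and rex in (0, 0x40):          # other REX bits are not handled
            out.append("xchg ax,ax" if opsize16 else "nop")
        elif bits == 32 and 0x40 <= op <= 0x47:      # inc r16/r32
            out.append(f"inc {regs[op - 0x40]}")
        elif bits == 32 and 0x48 <= op <= 0x4F:      # dec r16/r32
            out.append(f"dec {regs[op - 0x48]}")
        else:
            out.append(f"(unhandled {op:#04x})")
    return out

seq = bytes([0x66, 0x40, 0x90])
print(decode_toy_mode(seq, 32))   # two instructions: inc ax, then nop
print(decode_toy_mode(seq, 64))   # one instruction: 66 + REX + 90
```

The point of the sketch is that the mode is an input to the decoder, while 66 only changes a per-instruction flag inside it.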

Many instruction encodings are the same in both 32 and 64-bit mode, since both modes share the same default operand-size (32-bit) for non-stack instructions. E.g. b8 39 30 00 00 ( mov eax,0x3039 ) is the encoding of mov eax, 12345 in either 32 or 64-bit mode.
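As a quick check of that encoding, the bytes can be rebuilt by hand: opcode b8 (mov eax, imm32) followed by the immediate as 4 little-endian bytes. The helper name encode_mov_eax_imm32 is made up for this sketch.

```python
def encode_mov_eax_imm32(value: int) -> bytes:
    """Hypothetical helper: opcode b8, then the 32-bit immediate, little-endian."""
    return bytes([0xB8]) + value.to_bytes(4, "little")

encoded = encode_mov_eax_imm32(12345)
print(encoded.hex(" "))   # b8 39 30 00 00
```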

(When you say "64-bit instruction", I hope you don't mean 64-bit operand-size , because 64-bit mode is not limited to that. All operand-sizes from 8 to 64-bit are encodeable in 64-bit mode for most instructions.)


And yes, it's safe to assume that user-space programs don't switch modes by doing a far jmp . The exception is Windows, where the WOW64 DLLs do that for some reason instead of calling directly into the kernel. (On Linux, 32-bit user-space makes system calls directly, with sysenter or another system-call instruction.)

