
How to convert opcodes to hex codes

I am really stumped over here, please help...

I have ordered a bunch of books from Intel, the Software Developer's Manuals, and inside them they give me all these opcodes like "VEX.128..." or "0F 5B".

These books say things like "this works with XMM registers", but none of them talk about how to convert the word XMM into hexadecimal.

What I am trying to do is write "hello, world" in pure hexadecimal without using an assembler. Please help! I hope this makes sense; I am new to the world of assembly and hex codes.

The word XMM is not converted to hexadecimal. That an instruction uses XMM registers is a property of the opcode and its prefixes. The indices of the register operands are mostly encoded in the ModRM byte and partly in the prefix; for some operations on a GPR, the register number is encoded in the opcode byte itself.

Complexities aside, here is a simple VEX-prefixed example: vpaddb xmm1, xmm4, xmm6. Its entry in the manual (under the PADDB entry) says to encode it as: VEX.NDS.128.66.0F.WIG FC /r

VEX.NDS.128.66.0F.WIG describes the VEX prefix. NDS means that the vvvv field encodes the first source register. 128 means the L bit is not set, which makes the registers the XMM versions; with L set they would be the YMM versions. (So, as you can see, this distinction is encoded by a single bit in the VEX prefix, not by writing the word "XMM" in hexadecimal.) 66 indicates the setting of the pp field that corresponds to the mandatory 66 prefix in the legacy encoding, namely pp = 01. 0F is the opcode map, which is the one a 2-byte VEX prefix implies. WIG means W is ignored, which doesn't really matter here.

Anyway, this can use a 2-byte VEX prefix (no fancy opcode map, low register numbers), so start with C5 and then combine the fields ~R|~vvvv|L|pp (where | is concatenation). ~vvvv is the complement of vvvv, and vvvv = 0100 (xmm4), so ~vvvv = 1011. The R field is an extension of the reg field of the ModRM byte; xmm1 has an index lower than 8, so the R field is 0 and hence ~R is 1. Combined, the second prefix byte is 1|1011|0|01 = 11011001 = D9.
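To make the bit-packing above concrete, here is a small Python sketch. The function name vex2_second_byte and the PP table are my own illustration, not names from the manual, and it only covers the 2-byte (C5) form of the prefix.

```python
# pp values for the mandatory-prefix field of a VEX prefix.
PP = {None: 0b00, 0x66: 0b01, 0xF3: 0b10, 0xF2: 0b11}

def vex2_second_byte(r: int, vvvv: int, l: int, pp: int) -> int:
    """Pack ~R | ~vvvv | L | pp into the byte that follows C5."""
    not_r = ~r & 0x1        # R is stored inverted (1 bit)
    not_vvvv = ~vvvv & 0xF  # vvvv is stored inverted (4 bits)
    return (not_r << 7) | (not_vvvv << 3) | (l << 2) | pp

# R = 0 (xmm1 < 8), vvvv = 4 (xmm4), L = 0 (128-bit), pp for the 66 prefix
second = vex2_second_byte(r=0, vvvv=4, l=0, pp=PP[0x66])
print(f"{second:02X}")  # D9
```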

The opcode byte is FC; nothing funny happens here.

/r means to encode the rest of the operands in ModRM (+SIB). Here that means mod = 11 (two registers, no memory operand), reg = 001 (xmm1) and rm = 110 (xmm6). The fields are laid out as mod|reg|rm, so the byte is 11|001|110 = 11001110 = CE.
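The same ModRM packing as a short Python sketch (the helper name modrm is just for illustration):

```python
def modrm(mod: int, reg: int, rm: int) -> int:
    """Pack the ModRM fields: mod (2 bits) | reg (3 bits) | rm (3 bits)."""
    return (mod << 6) | (reg << 3) | rm

# mod = 11 (register-to-register), reg = xmm1, rm = xmm6
print(f"{modrm(0b11, 1, 6):02X}")  # CE
```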

So in total, vpaddb xmm1, xmm4, xmm6 becomes c5 d9 fc ce.
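Putting the pieces together, a minimal Python sketch that assembles the whole instruction might look like this. The function encode_vex128_66_0f is hypothetical and deliberately narrow: it assumes a VEX.NDS.128.66.0F instruction with three register operands, a 2-byte VEX prefix, and a last operand with an index below 8 (higher indices there need the 3-byte prefix, which is not handled).

```python
def encode_vex128_66_0f(opcode: int, dst: int, src1: int, src2: int) -> bytes:
    """Encode 'op xmm<dst>, xmm<src1>, xmm<src2>' with a 2-byte VEX prefix."""
    if not (0 <= dst <= 15 and 0 <= src1 <= 15 and 0 <= src2 <= 7):
        raise ValueError("register index out of range for this 2-byte VEX sketch")
    r = dst >> 3  # R extends the ModRM reg field to 4 bits
    # ~R | ~vvvv | L=0 (128-bit) | pp=01 (66 prefix)
    vex_byte = ((~r & 0x1) << 7) | ((~src1 & 0xF) << 3) | (0 << 2) | 0b01
    # mod=11 (registers only) | reg=dst | rm=src2
    modrm_byte = (0b11 << 6) | ((dst & 0x7) << 3) | src2
    return bytes([0xC5, vex_byte, opcode, modrm_byte])

code = encode_vex128_66_0f(0xFC, dst=1, src1=4, src2=6)  # vpaddb xmm1, xmm4, xmm6
print(" ".join(f"{b:02x}" for b in code))  # c5 d9 fc ce
```

If you have an assembler or disassembler at hand, it is worth comparing its output against bytes produced this way before trusting a hand-rolled encoder.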

You can find this information (and some details which I skipped) in Appendix B of the Intel SDM, "Instruction Formats and Encodings".
