
Referencing registers in machine code

I am looking at some assembly code and the corresponding memory dump, and I am having trouble understanding what is going on. I'm using this as a reference for x86 opcodes and this as a reference for x86 registers. I ran into these commands and realized I am still missing a big piece of the puzzle.

8B 45 F8       - mov eax,[ebp-08] 
8B 80 78040000 - mov eax,[eax+00000478]
8B 00          - mov eax,[eax]

Basically I don't understand what the two bytes after the opcode mean, and I can't find anywhere that gives a bit-by-bit format for the commands (if anyone could point me to one it would be much appreciated).

How does the CPU know how long each of these commands is?

According to my reference this 8B mov command allows the use of the 32b or 16b registers, meaning there are 16 possible registers (AX, CX, DX, BX, SP, BP, SI, DI, and their extended equivalents). That means you need a whole byte to specify which register to use in each operand.

Still fine so far: the two bytes after the opcode could specify which registers to use. Then I noticed that these commands are packed back to back in memory, and all three of them use a different number of bytes to specify the offset used when dereferencing the second operand.

I suppose you could limit the registers to only be able to use 16b with 16b and 32b with 32b, but that would only free up a single bit, not enough to tell the CPU how many bytes the offset is.

What values correspond to which registers?

The second thing that bothers me is that, though my reference explicitly numbers the registers, I do not see any correlation with the bytes after the opcode in these commands. These commands don't even seem to be consistent with each other. The second and third commands both go from eax to eax, but there is a bit midway through the first byte that is different.

Following my reference I would assume 0 is EAX, 1 is ECX, 2 is EDX, and so on. This doesn't, however, offer me any insight into how you would distinguish between RAX, EAX, AX, AL, and AH. Some of the commands seem to only accept 8b registers, while others take 16b or 32b, and on x86_64 some seem to take 16b, 32b, or 64b registers. So would you just do something like 0-7 are the R's, 8-15 the E's, 16-23 non-extended, and 24-31 the H's and L's? Even if it is something like that, it seems like it should be a lot easier to find a manual or something specifying it.

The first byte after the opcode is the ModR/M byte. The first reference you linked contains tables for the ModR/M byte toward the end of the page. For a memory access instruction such as these, the ModR/M byte indicates the register being loaded or stored and the addressing mode to use for the memory access.
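As a rough illustration of how that byte is laid out, here is a minimal Python sketch that splits a ModR/M byte into its three bit fields (mod in bits 7-6, reg in bits 5-3, r/m in bits 2-0). The helper name `split_modrm` and the `REG32` table are made up for this example, and the table only covers the 32-bit register names.

```python
# 32-bit register names indexed by their 3-bit register number.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11   # bits 7-6: addressing mode
    reg = (byte >> 3) & 0b111  # bits 5-3: register operand
    rm = byte & 0b111          # bits 2-0: base register (or register operand when mod == 3)
    return mod, reg, rm

# The three ModR/M bytes from the dump in the question.
for b in (0x45, 0x80, 0x00):
    mod, reg, rm = split_modrm(b)
    print(f"{b:02X}: mod={mod:02b} reg={REG32[reg]} rm={REG32[rm]}")
```

For the second and third instructions in the question, reg and r/m both come out as eax; only the mod field differs (10 versus 00), which is why the bytes 80 and 00 look unrelated at first glance.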

The byte(s) that follow the ModR/M byte depend on the value of the ModR/M byte.
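This is also how the CPU knows where each instruction ends. The sketch below is a simplified length calculator, assuming 32-bit mode, no prefixes, and the single opcode 8B; the function name `mov_8b_length` is hypothetical, and a real decoder has to handle many more opcodes and prefixes.

```python
def mov_8b_length(code, pos=0):
    """Length in bytes of an 8B /r instruction at code[pos] (32-bit mode, no prefixes)."""
    assert code[pos] == 0x8B
    modrm = code[pos + 1]
    mod, rm = modrm >> 6, modrm & 0b111
    length = 2                      # opcode byte + ModR/M byte
    if mod != 0b11 and rm == 0b100:
        sib = code[pos + 2]         # r/m == 100 means a SIB byte follows
        length += 1
        if mod == 0b00 and (sib & 0b111) == 0b101:
            length += 4             # SIB with no base register: disp32 follows
    if mod == 0b01:
        length += 1                 # 8-bit displacement
    elif mod == 0b10:
        length += 4                 # 32-bit displacement
    elif mod == 0b00 and rm == 0b101:
        length += 4                 # absolute disp32, no base register
    return length

# The three instructions from the question, back to back as in the dump.
dump = bytes.fromhex("8B45F8" "8B8078040000" "8B00")
pos, lengths = 0, []
while pos < len(dump):
    n = mov_8b_length(dump, pos)
    lengths.append(n)
    pos += n
print(lengths)  # [3, 6, 2]
```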

In the instruction "mov eax, [ebp-8]", the ModR/M byte is 45. From the table for the 32-bit ModR/M byte, this means Reg is eax and Effective Address is [EBP]+disp8. The next byte of the instruction, F8, is the 8-bit signed offset (-8).
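To check the displacement values by hand, a quick Python sketch can reinterpret the bytes as signed little-endian integers. It uses only the standard `int.from_bytes`; the variable names are just for illustration.

```python
# disp8 from "8B 45 F8": a single signed byte.
disp8 = int.from_bytes(bytes([0xF8]), "little", signed=True)

# disp32 from "8B 80 78 04 00 00": four bytes, least significant first.
disp32 = int.from_bytes(bytes.fromhex("78040000"), "little", signed=True)

print(disp8)        # -8
print(hex(disp32))  # 0x478
```

This matches the disassembly in the question: [ebp-08] and [eax+00000478].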

The operand size of the instruction can be implicit in the instruction or it can be specified by an instruction prefix. For example, for a mov instruction such as those in your examples, the 66 prefix (operand-size override) would indicate 16-bit operands. The 48 prefix (REX.W) would indicate 64-bit operands, if you're using 64-bit mode.

8-bit operands are usually indicated by the low bit of the opcode. If you change the opcode in your example from 8B to 8A, it becomes an 8-bit move into al.
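Putting the prefix and opcode-bit rules together, the same 3-bit register number selects a different register name depending on the operand size. The following is a minimal sketch for opcodes 8A/8B only; the names `REGS`, `operand_size` and `dest_register` are invented for this example. It assumes a default operand size of 32 bits, treats only the exact byte 48 as REX.W, and ignores the fact that a REX prefix changes the 8-bit names for register numbers 4-7 and adds registers 8-15.

```python
# Register names indexed by the 3-bit register number, grouped by operand size.
REGS = {
    8:  ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"],
    16: ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"],
    32: ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"],
    64: ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"],
}

def operand_size(opcode, prefixes=()):
    """Operand size in bits for opcodes 8A/8B, given the prefix bytes seen."""
    if opcode & 1 == 0:
        return 8    # low opcode bit clear: byte operands (8A)
    if 0x48 in prefixes:
        return 64   # REX.W, valid in 64-bit mode only
    if 0x66 in prefixes:
        return 16   # operand-size override prefix
    return 32       # assumed default when no prefix is present

def dest_register(opcode, modrm, prefixes=()):
    """Name of the register selected by the reg field of the ModR/M byte."""
    return REGS[operand_size(opcode, prefixes)][(modrm >> 3) & 0b111]

print(dest_register(0x8B, 0x00))          # eax
print(dest_register(0x8B, 0x00, [0x66]))  # ax
print(dest_register(0x8B, 0x00, [0x48]))  # rax
print(dest_register(0x8A, 0x00))          # al
```

So the register number itself never needs more than 3 bits in the ModR/M byte; the width comes from the opcode and prefixes rather than from a larger register index.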

