Assembly - jump instruction in machine code

Why does the jump instruction at line 1B (for example) become EBBD?

I know that "jmp" is encoded as EB, but how is BD calculated?

A short jump (opcode EB) is followed by a one-byte signed offset, which is added to the address of the instruction following the JMP.

For example, the first JMP L2 has an offset of FE, which equates to -2, and adding that to the address of the instruction following that JMP gives you the address of that JMP itself.
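The rule can be written out as a pair of small helpers. This is only a sketch in Python rather than assembly; the names `rel8_for` and `jump_target` are made up for illustration, and it assumes the two-byte short form (EB followed by one offset byte). The addresses 24 and 26 are those of the two JMP L2 instructions in the listing.

```python
JMP_SHORT_LEN = 2  # opcode EB plus one offset byte


def rel8_for(jmp_addr: int, target: int) -> int:
    """Offset byte (0..255) for a short JMP located at jmp_addr that goes to target."""
    disp = target - (jmp_addr + JMP_SHORT_LEN)  # relative to the NEXT instruction
    if not -128 <= disp <= 127:
        raise ValueError("target is out of range for a short jump")
    return disp & 0xFF  # two's-complement byte


def jump_target(jmp_addr: int, rel8: int) -> int:
    """Address reached by a short JMP at jmp_addr whose offset byte is rel8."""
    signed = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return jmp_addr + JMP_SHORT_LEN + signed


print(hex(rel8_for(0x24, 0x24)))     # 0xfe  (first JMP L2, jumps to itself)
print(hex(rel8_for(0x26, 0x24)))     # 0xfc  (second JMP L2)
print(hex(jump_target(0x24, 0xFE)))  # 0x24
```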

However, that's not the case for the first JMP L, since the offset needed there would be E8 (the second JMP L is also incorrect; it should have the offset E6). You can confirm this by going to an online x86 assembler and entering:

    mov ecx,2
l:  mov edx,0
    inc edx
    sub ecx,1
    nop
    nop
    nop
    setz al
    shl al,1
    mov byte [l1+1],al
l1: jmp l
    jmp l
    mov byte [l2+1],al
l2: jmp l2
    jmp l2
    mov eax,edx
    ret

You'll notice the three extra NOP lines. They're there because the assembler chooses the shorter, three-byte encoding of SUB ECX,1 (83 E9 01), and I want to keep the addresses lined up with your listing, where that instruction occupies the six bytes from 0B up to 11. The assembled code is as follows:

0:  b9 02 00 00 00          mov    ecx,0x2
00000005 <l>:
5:  ba 00 00 00 00          mov    edx,0x0
a:  42                      inc    edx
b:  83 e9 01                sub    ecx,0x1
e:  90                      nop
f:  90                      nop
10: 90                      nop
11: 0f 94 c0                sete   al
14: d0 e0                   shl    al,1
16: a2 1c 00 00 00          mov    ds:0x1c,al
0000001b <l1>:
1b: eb e8                   jmp    5 <l>
1d: eb e6                   jmp    5 <l>
1f: a2 25 00 00 00          mov    ds:0x25,al
00000024 <l2>:
24: eb fe                   jmp    24 <l2>
26: eb fc                   jmp    24 <l2>
28: 89 d0                   mov    eax,edx
2a: c3                      ret

This makes it evident that the encoding of the first two jumps in your posted code is incorrect. They should be EBE8/EBE6 rather than EBBD/EBEB. In fact, the latter pair wouldn't make sense even if they were going to some other label, since the difference between them should be exactly two if they're jumping to the same label.
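As a quick cross-check, the posted and corrected offset bytes can be decoded back into target addresses. This is a rough Python sketch; `short_jump_target` and `decode` are hypothetical helper names, and the addresses 1B and 1D are those of the two JMP L instructions in the listing.

```python
def short_jump_target(jmp_addr: int, rel8: int) -> int:
    """Target of a two-byte short JMP (EB rel8) located at jmp_addr."""
    signed = rel8 - 0x100 if rel8 >= 0x80 else rel8
    return jmp_addr + 2 + signed


def decode(offsets: dict) -> list:
    """Map {jmp_addr: offset_byte} to the list of target addresses."""
    return [short_jump_target(addr, rel8) for addr, rel8 in offsets.items()]


posted = {0x1B: 0xBD, 0x1D: 0xEB}     # bytes as printed in the question
corrected = {0x1B: 0xE8, 0x1D: 0xE6}  # bytes produced by the assembler

print([hex(t) for t in decode(posted)])     # ['-0x26', '0xa']
print([hex(t) for t in decode(corrected)])  # ['0x5', '0x5']
```

The posted pair decodes to two different targets, one of them before the start of the code, whereas the corrected pair lands on label L at 05 both times.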


One thing to be wary of however: if you examine the code closely, you'll see that it's actually self modifying, as the JMP instructions are modified with statements like:

MOV BYTE [L1 + 1],  AL

(changing the offset of the instruction at L1). Self-modifying code can be used for obfuscation, or to make software harder to reverse-engineer, and it may be that the code has already undergone the changes that will be applied.
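To see what that store does to the jump, here is a small Python sketch (the names are hypothetical). It assumes AL can only be 0 or 2 at that point, since SETZ yields 0 or 1 and SHL AL,1 doubles it.

```python
L1 = 0x1B                              # address of the JMP labelled L1
JMP_AT_L1 = bytes.fromhex("eb e8")     # its two bytes as assembled


def patched_target(al: int) -> int:
    """Target of the JMP at L1 after MOV BYTE [L1+1], AL has stored al."""
    patched = bytearray(JMP_AT_L1)
    patched[1] = al                    # overwrite only the offset byte
    rel = patched[1] - 0x100 if patched[1] >= 0x80 else patched[1]
    return L1 + 2 + rel                # relative to the following instruction


for al in (0, 2):
    print(f"AL={al}: EB{al:02X} jumps to {patched_target(al):#04x}")
```

With AL = 0 the jump simply falls through to the next instruction at 1D; with AL = 2 it skips that instruction and lands on 1F.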

It would be useful to watch that code dynamically as the self-changes are made, to see how they affect the code, but the rough outcome of my static analysis follows:

Address  Effect
-------  ------
  00     ecx = 2
  05     edx = 0
  0a     edx = 1
  0b     ecx = 1, zflag=F
  11     al = 0 (because zflag=F)
  14     al stays 0
  16     instruction at 1b becomes eb00, jmp 1d
  1b     jumps to 1d
  1d     jumps to 0a (per the posted EBEB; the corrected EBE6 would jump to 05)
  0a     edx = 2
  0b     ecx = 0, zflag=T
  11     al = 1 (because zflag=T)
  14     al = 2
  16     instruction at 1b becomes eb02, jmp 1f
  1b     jumps to 1f
  1f     instruction at 24 becomes eb02, jmp 28
  24     jumps to 28
  28     eax = 2
  2a     returns
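The trace above can also be replayed mechanically. What follows is a minimal Python sketch, assuming a toy interpreter (the hypothetical `run` function) that understands only the opcodes appearing in the disassembly; it is not a general x86 emulator. The bytes come from the disassembly, with the offset byte pair at 1d passed in so that both the posted EBEB and the assembler's EBE6 can be tried.

```python
HEAD = bytes.fromhex(
    "b9 02 00 00 00 "     # 00: mov ecx,2
    "ba 00 00 00 00 "     # 05: mov edx,0
    "42 "                 # 0a: inc edx
    "83 e9 01 90 90 90 "  # 0b: sub ecx,1 followed by three nops
    "0f 94 c0 "           # 11: sete al
    "d0 e0 "              # 14: shl al,1
    "a2 1c 00 00 00 "     # 16: mov [0x1c],al  (patches the jmp at 1b)
    "eb e8"               # 1b: jmp l
)
TAIL = bytes.fromhex(
    "a2 25 00 00 00 "     # 1f: mov [0x25],al  (patches the jmp at 24)
    "eb fe eb fc "        # 24: jmp l2 / 26: jmp l2
    "89 d0 c3"            # 28: mov eax,edx / 2a: ret
)


def run(jmp_at_1d: str, max_steps: int = 100) -> int:
    """Execute the listing with the given hex bytes at 1d and return EAX."""
    mem = bytearray(HEAD + bytes.fromhex(jmp_at_1d) + TAIL)  # writable copy
    eax = ecx = edx = ip = 0
    zf = False
    for _ in range(max_steps):
        op = mem[ip]
        if op in (0xB9, 0xBA):                      # mov ecx/edx, imm32
            value = int.from_bytes(mem[ip + 1:ip + 5], "little")
            if op == 0xB9:
                ecx = value
            else:
                edx = value
            ip += 5
        elif op == 0x42:                            # inc edx
            edx = (edx + 1) & 0xFFFFFFFF
            zf = edx == 0
            ip += 1
        elif op == 0x83 and mem[ip + 1] == 0xE9:    # sub ecx, imm8 (small positive only)
            ecx = (ecx - mem[ip + 2]) & 0xFFFFFFFF
            zf = ecx == 0
            ip += 3
        elif op == 0x90:                            # nop
            ip += 1
        elif op == 0x0F and mem[ip + 1:ip + 3] == b"\x94\xc0":  # sete al
            eax = (eax & 0xFFFFFF00) | int(zf)
            ip += 3
        elif op == 0xD0 and mem[ip + 1] == 0xE0:    # shl al,1
            al = (eax << 1) & 0xFF
            eax = (eax & 0xFFFFFF00) | al
            zf = al == 0
            ip += 2
        elif op == 0xA2:                            # mov [moffs32], al
            addr = int.from_bytes(mem[ip + 1:ip + 5], "little")
            mem[addr] = eax & 0xFF
            ip += 5
        elif op == 0xEB:                            # jmp rel8
            rel = mem[ip + 1]
            ip += 2 + (rel - 0x100 if rel >= 0x80 else rel)
        elif op == 0x89 and mem[ip + 1] == 0xD0:    # mov eax,edx
            eax = edx
            ip += 2
        elif op == 0xC3:                            # ret
            return eax
        else:
            raise ValueError(f"unsupported opcode {op:#04x} at {ip:#04x}")
    raise RuntimeError("step limit reached")


print(run("eb eb"))  # posted bytes at 1d
print(run("eb e6"))  # assembler's bytes at 1d
```

With the posted EBEB at 1d, the second pass re-enters at 0a and the function returns 2, as in the table. With EBE6, the jump lands on label l at 05, edx is reset to zero before being incremented, and the result is 1.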

Based on that, the instruction at L1 should never become EBBD (it's only ever changed to EB00 or EB02), so it's far more likely that what you have there is a simple misprint in the text (especially given the error in the second JMP L, which is never modified). I'd guess authors are no more perfect than the rest of us :-)
