简体   繁体   中英

Assembly Jump/Branch/Lookup Tables instead of lots of cmp/je?

I've just started learning assembly and am making a simple boot loader as part of my OS class. I'm trying to make my code a little more efficient, ie I don't think what I've done so far is a particularly good way of achieving what I want. That said, I've been struggling to find any resources online that document a jump/branch/lookup table which I believe would be the most efficient way of doing this.

To explain what I'm trying to achieve, I'm calling a function which returns a value in the dx register, from 0 to 4. Currently I'm using cmp instructions one after the other to compare the value and make a conditional je jump if the value is the same. If I had written this in a higher level language, I'd essentially be doing multiple if statements one after the other rather than using a more efficient switch statement.

So here's what I'm doing right now:

cmp dx, 1          
je .F_1
cmp dx, 2
je .F_2
cmp dx, 3
je .F_3
cmp dx, 4
je .F_4
cmp dx, 0
je .F_5
jmp RangeError_Handler

.F1:
  mov   si, msg1
  jmp   F_Exit
.F2:
  mov   si, msg2
  jmp   F_Exit
...  ; .F3 and .F4 follow the pattern

.F5:             ; special case
  mov   si, msg_error
  call  PrintLn
  hlt

F_Exit:
  call  PrintLn
  ...            ; and do something else


msg1: db 'Message 1', 0
msg2: ...
...

There has to be a better way to do this. My tutor hinted that the jump table would be ideal but didn't have time to give me any kind of further explanation as to how that might work in assembly so I'd be extremely grateful if someone could provide some sort of example in the context of my situation.

Theoretically, I'd have one function that checked the value of dx and then jumped to a particular function rather than checking 5 separate times, I just can't see how I would implement that in assembly. Would it be more efficient to use a lookup table for the strings also? ie a return value of 1 would indicate string 1 in the table?

Most of your cases have the same instructions with different data, so you don't even need a jump table. Just use a table of strings, and only jump for the condition where different instructions need to run, not the same instructions with different data.

    mov  si, dx                   ; SI can be used in addressing modes, DX can't
    shl  si                       ; 16-bit doesn't allow scaled indices, so we can't just do [table + si*2].  And shl sets flags
    cmp  dx, 4
    ja   RangeError_Handler

    mov  si, [F_messages + si]
      ; call PrintLn   could be here, if it preserves DX or SI for us to test after

    test dx,dx             ; detect the one special case.
    jnz  .F_Exit

    ;; fall through only in the dx==0 case
    call  PrintLn
RangeError_Handler:
    hlt                             ; Are interrupts disabled?  if not, execution will continue after hlt

.F_exit
    call  PrintLn
    ...   ; and do whatever else your code needs to do


F_messages:                # char* F_messages[]
    dw  msg1,
        msg2
        ...

Using tables instead of a chain of conditional jumps is extremely widely applicable. The logic would be pretty much identical if this was 64-bit x86 code, or even ARM or MIPS assembly. Or even C. (A good C compiler might turn your switch into a table lookup for the data instead of a jump table).


You could factor the call PrintLn out of both sides of the branch, but only if it preserves either DX or SI. If you had to PUSH/POP the input value so you'd be able to test it again, it wouldn't be worth it. Since the special case is DX==0, (not DX==5 like the previous version of this answer had), we can't do both JCCs with the FLAGS from one CMP.


If you did want to make a jump table:

jmp  [jump_table + si]


jump_table:
   dw   .F_1,  .F_2, ...

And then instead of string addresses, use DW to make a table of code addresses in memory. If each case is the same size (in machine code bytes), you could avoid having a table of pointers, and just compute a jump distance relative to the address of the first one.


Make sure you know what CS is set to before using absolute addresses . Normal jumps are relative, but indirect jump/call use absolute addresses. As @MichaelPetch's comment points out, a FAR JMP at some point in your code will set CS for you.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM