I've been learning LC-3 and I'm writing a disassembler. I have a question about how, when reading assembled LC-3 code, to tell the difference between an opcode and a processor directive (for example, a .FILL directive).
Now, LC3 instructions are 16 bits wide. Here's an example:
0000100000000111
This example is a BR (branch) instruction. The first 4 bits from the left (0000) indicate this is a BR opcode, and the 3 bits after (100) are the conditions to test (negative, zero, and positive, in that order; so 100 means test for negative only, 110 would mean test for negative or zero, etc.). If none of these three bits are set, then this is in fact not a BR instruction, but a processor directive. So take the following:
0000000001001001
The first 4 bits from the left are 0000, so normally this would be a BR instruction, but because none of the Neg, Zero, Pos bits are set, it is instead a .FILL processor directive, meaning that all 16 bits are instead read as one hex number, in this case 0x0049.
Now this is just one of 16 opcodes. I also happen to know that the opcode for TRAP instructions (1111) has a similar special case, i.e. if ANY of the same three bits (plus one more to the right) are set, then it is not a TRAP instruction. Now, since opcode 0000 and opcode 1111 have these special cases, it makes sense that the opcodes in between should have special cases like this too. But I've gone through all the documentation that I can find, and none of it seems to mention these cases (although I know that I'm right from reading the .lst files produced by the LC-3 assembler). Does anyone know of any documentation that mentions this? Or does anyone know of any more of these 'special cases'? Thanks
EDIT: Thanks for your reply, but I'm still not sure how I can tell the difference between an instruction and data. Here are a few lines I just copied from a .lst file:
(305A) 0059 0000000001011001 ( 45) .FILL x0059
(305B) 0000 0000000000000000 ( 45) .FILL x0000
(305C) FFD0 1111111111010000 ( 46) RESET .FILL xFFD0
(305D) 0045 0000000001000101 ( 47) LINE1 .FILL x0045
Notice that the first two lines and the last line could easily be read as BR instructions if it weren't for the fact that none of the NZP bits are set. And the same goes for the 3rd line and TRAP instructions. So, what about this:
0101001001100000 ;; or 0x5260
Is this an AND instruction or is it a .FILL directive? How to tell?
.FILL can emit arbitrary 16-bit words into the output file.
There's literally no difference in the machine code between manually encoding an instruction and emitting it with .fill vs. letting the assembler encode it from a mnemonic + operands. See How to avoid executing variables in lc3 assembly for an example.
When the CPU is decoding / executing machine code, it doesn't care how the bytes got there, it just treats them as instructions. Your disassembler should do the same.
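As a concrete illustration, the word 0x5260 from the question can be decoded purely from its bit fields. This is a minimal Python sketch, not part of any existing tool: the function name `decode_and` is hypothetical, and it only handles the AND opcode (0101). It assumes the standard LC-3 AND layout (DR in bits 11-9, SR1 in bits 8-6, bit 5 selecting immediate mode).

```python
def decode_and(word: int) -> str:
    """Decode a 16-bit LC-3 word whose top four bits are 0101 (AND)."""
    assert (word >> 12) == 0b0101, "not an AND opcode"
    dr = (word >> 9) & 0x7   # bits 11-9: destination register
    sr1 = (word >> 6) & 0x7  # bits 8-6: first source register
    if word & 0x20:          # bit 5 set: immediate mode
        imm5 = word & 0x1F
        if imm5 & 0x10:      # sign-extend the 5-bit immediate
            imm5 -= 0x20
        return f"AND R{dr}, R{sr1}, #{imm5}"
    sr2 = word & 0x7         # bits 2-0: second source register
    return f"AND R{dr}, R{sr1}, R{sr2}"


print(decode_and(0x5260))  # AND R1, R1, #0
```

The result is the same whether the source file said `AND R1, R1, #0` or `.FILL x5260`; nothing in the 16 bits records which one was written.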
If you encounter an instruction word that isn't a valid LC-3 instruction, you might simply disassemble it as .fill 0x1234 or something, then move on to the next word. This is what existing disassemblers do for ISAs like ARM and x86.
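That fallback could be sketched roughly as follows in Python. The function name `disassemble` is hypothetical, and for brevity only BR, TRAP, and the reserved opcode 1101 are examined; every other word is also emitted as `.FILL` here, where a fuller decoder would handle the remaining opcodes. Treating a BR word with no condition bits set as data is a choice made for this sketch to match the listings in the question; the hardware would simply execute it as a branch that is never taken.

```python
def disassemble(word: int) -> str:
    """Return a mnemonic for one 16-bit LC-3 word, or a .FILL fallback."""
    opcode = word >> 12
    fill = f".FILL x{word:04X}"
    if opcode == 0b0000:                  # BR
        nzp = (word >> 9) & 0x7
        if nzp == 0:                      # never-taken branch: shown as data (assumption)
            return fill
        offset = word & 0x1FF             # PCoffset9
        if offset & 0x100:                # sign-extend the 9-bit offset
            offset -= 0x200
        flags = "".join(c for c, bit in zip("nzp", (4, 2, 1)) if nzp & bit)
        return f"BR{flags} #{offset}"
    if opcode == 0b1111:                  # TRAP
        if word & 0x0F00:                 # bits 11-8 must be zero in a TRAP
            return fill
        return f"TRAP x{word & 0xFF:02X}"
    # Reserved opcode 1101, plus every opcode this sketch does not decode.
    return fill


for w in (0x0807, 0x0059, 0xFFD0, 0xF025):
    print(f"{w:04X}  {disassemble(w)}")
```

With the words from the question, 0x0807 comes out as `BRn #7`, while 0x0059 and 0xFFD0 fall through to `.FILL`.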