How to tell the difference between LC-3 opcodes and LC-3 processor directives

Question

I've been learning LC-3 and I'm writing a disassembler. My question is how, when reading assembled LC-3 code, to tell the difference between an opcode and a processor directive (for example, a .FILL directive).

Now, LC3 instructions are 16 bits wide. Here's an example:

0000100000000111

This example is a BR (branch) instruction. The first 4 bits from the left (0000) indicate that this is a BR opcode, and the 3 bits after them (100) are the conditions to test (negative, zero, and positive, in that order; so 100 means test for negative only, 110 would mean test for negative or zero, etc.). If none of these three bits is set, then this is in fact not a BR instruction but a processor directive. So take the following:

0000000001001001

The first 4 bits from the left are 0000, so normally this would be a BR instruction, but because none of the N, Z, and P bits is set, it is instead a .FILL processor directive, meaning that all 16 bits are read as a single hex number, in this case 0x0049.

Now this is just one of 16 opcodes. I also happen to know that the opcode for TRAP instructions (1111) has a similar special case, i.e. if ANY of the same three bits (plus one more to the right) are set, then it is not a TRAP instruction. Since opcode 0000 and opcode 1111 have these special cases, it makes sense that the opcodes in between should have special cases like this too. But I've gone through all the documentation I can find, and none of it mentions these cases (although I know that I'm right from reading the .lst files produced by the LC-3 assembler). Does anyone know of any documentation that mentions this? Or does anyone know of any more of these 'special cases'? Thanks

EDIT: Thanks for your reply, but I'm still not sure how I can tell the difference between an instruction and data. Here are a few lines I just copied from a .lst file:

(305A) 0059  0000000001011001 (  45)                 .FILL x0059
(305B) 0000  0000000000000000 (  45)                 .FILL x0000
(305C) FFD0  1111111111010000 (  46) RESET           .FILL xFFD0
(305D) 0045  0000000001000101 (  47) LINE1           .FILL x0045

Notice that the first two lines and the last line could easily be read as BR instructions if it weren't for the fact that none of the NZP bits are set. The same goes for the third line and TRAP instructions. So, what about this:

0101001001100000   ;; or 0x5260

Is this an AND instruction or is it a .FILL directive? How to tell?

Answer 1

.FILL can emit any arbitrary 16-bit word into the output file.

There's literally no difference in the machine code between manually encoding an instruction and emitting it with .fill vs. letting the assembler encode it from a mnemonic + operands. See "How to avoid executing variables in lc3 assembly" for an example.
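To make this concrete for the 0x5260 word from the question, here is a minimal Python sketch that encodes the immediate form of AND by hand and compares the result with the value a .FILL x5260 would emit. The function name encode_and_imm is made up for this illustration; the bit layout (opcode 0101, DR, SR1, a 1 in bit 5, then imm5) is the standard LC-3 encoding.

```python
def encode_and_imm(dr, sr1, imm5):
    """Encode 'AND DR, SR1, #imm5' as a 16-bit LC-3 word (hypothetical helper)."""
    if not (0 <= dr <= 7 and 0 <= sr1 <= 7):
        raise ValueError("register numbers must be 0..7")
    if not (-16 <= imm5 <= 15):
        raise ValueError("imm5 must fit in 5 signed bits")
    # opcode 0101 | DR | SR1 | 1 (immediate mode) | imm5 (two's complement)
    return (0b0101 << 12) | (dr << 9) | (sr1 << 6) | (1 << 5) | (imm5 & 0x1F)


word_from_mnemonic = encode_and_imm(1, 1, 0)  # AND R1, R1, #0
word_from_fill = 0x5260                       # .FILL x5260

print(f"{word_from_mnemonic:016b}")
print(word_from_mnemonic == word_from_fill)
```

Both routes produce the same 16 bits, so nothing in the object file records which one the programmer used.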

When the CPU is decoding / executing machine code, it doesn't care how the bytes got there, it just treats them as instructions. Your disassembler should do the same.

If you encounter an instruction word that isn't a valid LC-3 instruction, you might simply disassemble it as .fill 0x1234 or something, then move on to the next word. This is what existing disassemblers do for ISAs like ARM and x86.
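A rough Python sketch of that fallback approach follows. It only handles BR, ADD, AND, and TRAP; every other opcode (including the reserved 1101) falls through to .FILL. The names sext and disassemble_word are invented for this example. Printing a BR with no condition bits set, or a TRAP with nonzero bits 11-8, as .FILL is a heuristic chosen here to match the .lst lines in the question, not something the hardware requires. Branch targets are shown as raw PC offsets rather than labels.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return (value ^ sign) - sign


def disassemble_word(word):
    """Decode one 16-bit word; fall back to .FILL when it does not look valid."""
    fill = f".FILL x{word:04X}"
    op = (word >> 12) & 0xF

    if op == 0b0000:  # BR
        nzp = (word >> 9) & 0x7
        if nzp == 0:
            return fill  # heuristic: never-taken branch is most likely data
        flags = "".join(c for c, bit in zip("nzp", (4, 2, 1)) if nzp & bit)
        return f"BR{flags} #{sext(word, 9)}"

    if op in (0b0001, 0b0101):  # ADD, AND share one layout
        name = "ADD" if op == 0b0001 else "AND"
        dr, sr1 = (word >> 9) & 7, (word >> 6) & 7
        if word & 0x20:  # bit 5 set: immediate form
            return f"{name} R{dr}, R{sr1}, #{sext(word, 5)}"
        if (word & 0x18) == 0:  # register form needs bits 4-3 clear
            return f"{name} R{dr}, R{sr1}, R{word & 7}"
        return fill

    if op == 0b1111:  # TRAP
        if (word & 0x0F00) == 0:  # bits 11-8 must be zero
            return f"TRAP x{word & 0xFF:02X}"
        return fill

    return fill  # reserved opcode 1101, or an opcode this sketch skips


for w in (0x0807, 0x0049, 0xFFD0, 0x5260):
    print(f"{w:04X}  {disassemble_word(w)}")
```

With this approach the word 0x5260 prints as an AND, because nothing in the bits says otherwise; only words that fail to decode are shown as data.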

solution1
1 2018-11-21 18:31:31
