How should I work with dynamically-sized input in NASM Assembly?

Question

I'm trying to learn assembly with NASM on 64 bit Linux.

I managed to make a program that reads two numbers and adds them. The first thing I realized was that the program will only work with one-digit numbers (and results):

; Calculator

SECTION .data
    msg1    db  "Enter the first number: "
    msg1len equ $-msg1
    msg2    db  "Enter the second number: "
    msg2len equ $-msg2
    msg3    db  "The result is: "
    msg3len equ $-msg3

SECTION .bss
    num1    resb    1
    num2    resb    1
    result  resb    1

SECTION .text
    global main

main:
    ; Ask for the first number
    mov EAX,4
    mov EBX,1
    mov ECX,msg1
    mov EDX,msg1len
    int 0x80

    ; Read the first number
    mov     EAX,3
    mov     EBX,1
    mov ECX,num1
    mov EDX,2
    int 0x80

    ; Ask for the second number
    mov EAX,4
    mov EBX,1
    mov ECX,msg2
    mov EDX,msg2len
    int 0x80

    ; Read the second number
    mov     EAX,3
    mov     EBX,1
    mov ECX,num2
    mov EDX,2
    int 0x80

    ; Prepare to announce the result
    mov EAX,4
    mov EBX,1
    mov ECX,msg3
    mov EDX,msg3len
    int 0x80

    ; Do the sum
    ; Store read values to EAX and EBX
    mov EAX,[num1]
    mov EBX,[num2]

    ; From ASCII to decimal
    sub EAX,'0'
    sub EBX,'0'

    ; Add
    add EAX,EBX

    ; Convert back to ASCII
    add EAX,'0'

    ; Save the result back to the variable
    mov [result],EAX

    ; Print result
    mov EAX,4
    mov EBX,1
    mov ECX,result
    mov EDX,1
    int 0x80

As you can see, I reserve one byte for the first number, another for the second, and one more for the result. This isn't very flexible. I would like to make additions with numbers of any size.

How should I approach this?

Answer 1

First of all, you are generating a 32-bit program, not a 64-bit one. This is not a problem: 64-bit Linux can run 32-bit programs if they are either statically linked (this is the case for you) or the 32-bit shared libraries are installed.

Your program contains a real bug: it moves data between the 32-bit "EAX" register and a 1-byte field in RAM:

mov EAX, [num1]

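This will normally work on little-endian computers (x86), because the byte you want ends up in the low 8 bits of EAX; the upper 24 bits hold whatever follows "num1" in memory. However, if the byte you want to read is at the very end of the last mapped memory page of your program, the 4-byte read runs into unmapped memory and the program crashes (on Linux, typically with a segmentation fault).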

Even more critical is the write command:

mov [result], EAX

This instruction overwrites the 3 bytes of memory that follow the "result" variable. If you extend your program with additional variables:

num1 resb 1
num2 resb 1
result resb 1
newVariable1 resb 1

You'll overwrite these variables! To correct your program, use the 8-bit AL and BL registers instead of the full EAX and EBX registers:

mov AL, [num1]
mov BL, [num2]
...
mov [result], AL
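
To see the effect without running the assembly, here is a small C sketch that imitates the two kinds of store on a byte array. The helper names (store_dword, store_byte, init_layout) and the 8-byte layout are made up for illustration; they are not part of your program.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: imitates "mov [mem+offset], EAX" (a 4-byte store). */
static void store_dword(unsigned char *mem, size_t offset, uint32_t value)
{
    memcpy(mem + offset, &value, sizeof value);
}

/* Hypothetical helper: imitates "mov [mem+offset], AL" (a 1-byte store). */
static void store_byte(unsigned char *mem, size_t offset, uint32_t value)
{
    mem[offset] = (unsigned char)(value & 0xFF);
}

/* Assumed layout: [0]=num1 [1]=num2 [2]=result [3]=newVariable1,
   followed by padding so the 4-byte store stays inside the array. */
static void init_layout(unsigned char mem[8])
{
    memset(mem, 0, 8);
    mem[0] = '3';
    mem[1] = '4';
    mem[3] = 'X';   /* stand-in value for newVariable1 */
}
```

After init_layout(mem), calling store_dword(mem, 2, '7') zeroes mem[3] (the stand-in for newVariable1), while store_byte(mem, 2, '7') leaves it untouched.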

Another problem in your program: you are reading from file handle #1, which is the standard output. Your program should read from file handle #0 (standard input):

mov EAX, 3 ; read
mov EBX, 0 ; standard input
...
int 0x80

But now the answer to the actual question:

The C library functions (e.g. fgets()) use buffered input. Doing it that way would be a bit too complicated to begin with, so reading one byte at a time is a reasonable option.

Think of it this way: "How would I solve this problem in a high-level language like C?" If you don't use libraries in your assembler program, the only functions available to you are system calls (section 2 of the man pages); for example, you cannot use "fgets()", only "read()".

In your case a C program reading a number from standard input could look like this:

int num1;
char c;
...
num1 = 0;
while(1)
{
    if(read(0,&c,1)!=1) break;
    if(c=='\r' || c=='\n') break;
    num1 = 10*num1 + c - '0';
}

Now you can work out the assembler code (I typically use the GNU assembler, which has a different syntax, so this code may contain some bugs):

c resb 1
num1 resb 4

...

    ; Set "num1" to 0
  mov EAX, 0
  mov [num1], EAX
    ; Here our while-loop starts
next_digit:
    ; Read one character
  mov EAX, 3
  mov EBX, 0
  mov ECX, c
  mov EDX, 1
  int 0x80
    ; Check for the end-of-input
  cmp EAX, 1
  jnz end_of_loop
    ; This will cause EBX to be 0.
    ; When modifying the BL register the
    ; low 8 bits of EBX are modified.
    ; The high 24 bits remain 0.
    ; So clearing the EBX register before
    ; reading an 8-bit number into BL is
    ; a method for converting an 8-bit
    ; number to a 32-bit number!
  xor EBX, EBX
    ; Load the character read into BL
    ; Check for "\r" or "\n" as input
  mov BL, [c]
  cmp BL, 10
  jz end_of_loop
  cmp BL, 13
  jz end_of_loop
    ; read "num1" into EAX
  mov EAX, [num1]
    ; Multiply "num1" with 10
  mov ECX, 10
  mul ECX
    ; Add one digit
  sub EBX, '0'
  add EAX, EBX
    ; write "num1" back
  mov [num1], EAX
    ; Do the while loop again
  jmp next_digit
    ; The end of the loop...
end_of_loop:
    ; Done
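
The loop above treats every byte that is not a line ending as a digit. As a possible refinement, here is a C sketch in the same style as the C version above that rejects non-digit input. The function name read_number and its return convention are assumptions made for this example; like the assembly, it silently wraps around if the number does not fit into the integer.

```c
#include <unistd.h>

/* Hypothetical helper, not part of the original program.
   Reads decimal digits from fd, one byte at a time, until '\n', '\r'
   or end of input. Returns 0 on success and stores the value in *out;
   returns -1 if a non-digit byte was seen or no digit was read.
   Overflow is not detected: the value wraps like the 32-bit register. */
static int read_number(int fd, unsigned int *out)
{
    unsigned int value = 0;
    int digits = 0;
    char c;

    while (read(fd, &c, 1) == 1) {
        if (c == '\r' || c == '\n')
            break;                      /* end of the line */
        if (c < '0' || c > '9')
            return -1;                  /* not a decimal digit */
        value = 10 * value + (unsigned int)(c - '0');
        digits++;
    }
    if (digits == 0)
        return -1;                      /* empty line or end of input */
    *out = value;
    return 0;
}
```

Calling read_number(0, &num1) reads one line from standard input. In assembly, the extra check would be two more compare-and-jump pairs on BL before the multiplication.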

Writing decimal numbers with more than one digit is more difficult!
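
One common way to do the output, sketched in C: divide by 10 repeatedly, collect the remainders (which come out last digit first) and reverse them. The function name format_decimal is made up for this example, and the sketch only fills a buffer; printing it would be a single write() call with the returned length.

```c
#include <stddef.h>

/* Hypothetical helper: writes the decimal digits of value into buf
   (no terminating NUL) and returns the number of digits written.
   buf must have room for every digit of an unsigned int
   (10 bytes if unsigned int is 32 bits wide). */
static size_t format_decimal(unsigned int value, char *buf)
{
    char tmp[3 * sizeof(unsigned int)];  /* enough for all digits */
    size_t n = 0;

    /* Remainders give the digits in reverse order. */
    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    /* Copy them back in the right order. */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];
    return n;
}
```

In assembly, the division step maps to the DIV instruction, which leaves the quotient in EAX and the remainder in EDX (EDX has to be cleared before each DIV with a 32-bit divisor).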
