简体   繁体   中英

Smallest method of turning a string into an integer(and vice-versa)

I am looking for an extremely small way of turning a string like "123" into an integer like 123 and vice-versa.

I will be working in a freestanding environment. This is NOT a premature optimization. I am creating code that must fit in 512 bytes, so every byte does actually count. I will take both x86 assembly(16 bit) and C code though(as that is pretty easy to convert)

It does not need to do any sanity checks or anything..

I thought I had seen a very small C implementation implemented recursively, but I can't seem to find anything for size optimization..

So can anyone find me(or create) a very small atoi/itoa implementation? (it only needs to work with base 10 though)

Edit: (the answer) (edited again because the first code was actually wrong) in case someone else comes upon this, this is the code I ended up creating. It could fit in 21 bytes!

;ds:bx is the input string. ax is the returned integer
_strtoint:
    xor ax,ax
    .loop1:
        imul ax, 10 ;ax serves as our temp var
        mov cl,[bx]
        mov ch,0
        add ax,cx
        sub ax,'0'
        inc bx
        cmp byte [bx],0
    jnz .loop1
ret

Ok, last edit I swear! Version weighing in at 42 bytes with negative number support.. so if anyone wants to use these they can..


;ds:bx is the input string. ax is the returned integer
_strtoint:
    cmp byte [bx],'-'
    je .negate
    ;rewrite to negate DX(just throw it away)
    mov byte [.rewrite+1],0xDA
    jmp .continue
    .negate:
    mov byte [.rewrite+1],0xD8
    inc bx
    .continue
    xor ax,ax
    .loop1:
        imul ax, 10 ;ax serves as our temp var
        mov dl,[bx]
        mov dh,0
        add ax,dx
        sub ax,'0'
        inc bx
        cmp byte [bx],0
    jnz .loop1
    ;popa
    .rewrite:
    neg ax ;this instruction gets rewritten to conditionally negate ax or dx
ret

With no error checking, 'cause that's for wussies who have more than 512B to play with:

#include <ctype.h>
// alternative:
// #define isdigit(C) ((C) >= '0' && (C) <= '9')

unsigned long myatol(const char *s) {
  unsigned long n = 0;
  while (isdigit(*s)) n = 10 * n + *s++ - '0';
  return n;
}

gcc -O2 compiles this into 47 bytes, but the external reference to __ctype_b_loc is probably more than you can afford...

I don't have an assembler on my laptop to check the size, but offhand, it seems like this should be shorter:

; input: zero-terminated string in DS:SI
; result: AX
atoi proc
        xor cx, cx
        mov ax, '0'
    @@:
        imul cx, 10
        sub al, '0'
        add cx, ax
        lodsb
        jnz @b
        xchg ax, cx
        ret
atoi endp

Write it yourself. Note that subtracting '0' from a digit gets the power-of-ten. So, you loop down the digits, and every time you multiply the value so far by 10, subtract '0' from the current character, and add it. Codable in assembly in no time flat.

And here is another one without any checking. It assumes a null terminated string. As a bonus, it checks for a negative sign. This takes 593 bytes with a Microsoft compiler (cl /O1).

int myatoi( char* a )
{
   int res = 0;
   int neg = 0;

   if ( *a == '-' )
      {
      neg = 1;
      a++;
      }

   while ( *a )
      {
      res = res * 10 + ( *a - '0' );
      a++;
      }

   if ( neg )
      res *= -1;

   return res;
}

atoi(p) register char *p; { register int n; register int f;

n = 0;
f = 0;
for(;;p++) {
    switch(*p) {
    case ' ':
    case '\t':
        continue;
    case '-':
        f++;
    case '+':
        p++;
    }
    break;
}
while(*p >= '0' && *p <= '9')
    n = n*10 + *p++ - '0';
return(f? -n: n);

}

如果使用-Os(优化空间)而不是-O2,是否有任何尺寸更小?

您可以尝试将字符串打包成BCD(0x1234),然后使用x87 fbld和第一条指令来获得20世纪80年代的解决方案,但我不确定它是否会更小,因为我不记得有任何打包指令。

How in the world are you people getting the executables so small?! This code generates a 316 byte .o file when compiled with gcc -Os -m32 -c -o atoi.o atoi.c and a 8488 byte executable when compiled and linked (with an empty int main(){} added) with gcc -Os -m32 -o atoi atoi.c . This is on Mac OS X Snow Leopard...

int myatoi(char *s)
{
 short retval=0;
 for(;*s!=0;s++) retval=retval*10+(*s-'0');
 return retval;
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM