I am looking for an extremely small way of turning a string like "123"
into an integer like 123
and vice-versa.
I will be working in a freestanding environment. This is NOT a premature optimization. I am creating code that must fit in 512 bytes, so every byte does actually count. I will take both x86 assembly(16 bit) and C code though(as that is pretty easy to convert)
It does not need to do any sanity checks or anything..
I thought I had seen a very small C implementation implemented recursively, but I can't seem to find anything for size optimization..
So can anyone find me(or create) a very small atoi/itoa implementation? (it only needs to work with base 10 though)
Edit: (the answer) (edited again because the first code was actually wrong) in case someone else comes upon this, this is the code I ended up creating. It could fit in 21 bytes!
;ds:bx is the input string. ax is the returned integer
_strtoint:
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov cl,[bx]
mov ch,0
add ax,cx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
ret
Ok, last edit I swear! Version weighing in at 42 bytes with negative number support.. so if anyone wants to use these they can..
;ds:bx is the input string. ax is the returned integer
_strtoint:
cmp byte [bx],'-'
je .negate
;rewrite to negate DX(just throw it away)
mov byte [.rewrite+1],0xDA
jmp .continue
.negate:
mov byte [.rewrite+1],0xD8
inc bx
.continue
xor ax,ax
.loop1:
imul ax, 10 ;ax serves as our temp var
mov dl,[bx]
mov dh,0
add ax,dx
sub ax,'0'
inc bx
cmp byte [bx],0
jnz .loop1
;popa
.rewrite:
neg ax ;this instruction gets rewritten to conditionally negate ax or dx
ret
With no error checking, 'cause that's for wussies who have more than 512B to play with:
#include <ctype.h>
// alternative:
// #define isdigit(C) ((C) >= '0' && (C) <= '9')
unsigned long myatol(const char *s) {
unsigned long n = 0;
while (isdigit(*s)) n = 10 * n + *s++ - '0';
return n;
}
gcc -O2
compiles this into 47 bytes, but the external reference to __ctype_b_loc
is probably more than you can afford...
I don't have an assembler on my laptop to check the size, but offhand, it seems like this should be shorter:
; input: zero-terminated string in DS:SI
; result: AX
atoi proc
xor cx, cx
mov ax, '0'
@@:
imul cx, 10
sub al, '0'
add cx, ax
lodsb
jnz @b
xchg ax, cx
ret
atoi endp
Write it yourself. Note that subtracting '0' from a digit gets the power-of-ten. So, you loop down the digits, and every time you multiply the value so far by 10, subtract '0' from the current character, and add it. Codable in assembly in no time flat.
And here is another one without any checking. It assumes a null terminated string. As a bonus, it checks for a negative sign. This takes 593 bytes with a Microsoft compiler (cl /O1).
int myatoi( char* a )
{
int res = 0;
int neg = 0;
if ( *a == '-' )
{
neg = 1;
a++;
}
while ( *a )
{
res = res * 10 + ( *a - '0' );
a++;
}
if ( neg )
res *= -1;
return res;
}
atoi(p) register char *p; { register int n; register int f;
n = 0;
f = 0;
for(;;p++) {
switch(*p) {
case ' ':
case '\t':
continue;
case '-':
f++;
case '+':
p++;
}
break;
}
while(*p >= '0' && *p <= '9')
n = n*10 + *p++ - '0';
return(f? -n: n);
}
如果使用-Os(优化空间)而不是-O2,是否有任何尺寸更小?
您可以尝试将字符串打包成BCD(0x1234),然后使用x87 fbld和第一条指令来获得20世纪80年代的解决方案,但我不确定它是否会更小,因为我不记得有任何打包指令。
How in the world are you people getting the executables so small?! This code generates a 316 byte .o file when compiled with gcc -Os -m32 -c -o atoi.o atoi.c
and a 8488 byte executable when compiled and linked (with an empty int main(){}
added) with gcc -Os -m32 -o atoi atoi.c
. This is on Mac OS X Snow Leopard...
int myatoi(char *s)
{
short retval=0;
for(;*s!=0;s++) retval=retval*10+(*s-'0');
return retval;
}
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.