
Converting string to a number

I came across this C program:

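#include <stdio.h>

int main() {
    char address[100];   /* declarations were not shown in the snippet; the size is assumed */
    int number, i;

    printf("Enter your address, (e.g. 51 Anzac Road) ");
    gets(address);
    number = 0;
    i = 0;
    while (address[i] != ' ') {
        number = number * 10 + (address[i] - 48);
        i++;
    }
}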

I understand that number = number * 10 + (address[i] - 48); extracts the number from the input, but can anybody explain how this works? How does it produce the number from the input?

In ASCII, the digit characters '0' through '9' occupy code points 48 through 57 (in hex, 0x30 through 0x39), so to turn a digit character into its value, you just subtract 48.


As an aside, you should really subtract '0', since the standard doesn't guarantee ASCII, though it does guarantee that the digit characters are contiguous and ordered. C under z/OS, for example, uses EBCDIC, which places the digits at code points 0xf0 through 0xf9.
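As a minimal sketch of that portable conversion, the helper below (digit_value is a hypothetical name, not something from the original program) subtracts '0' rather than 48 and rejects anything that is not a digit:

```c
/* Hypothetical helper: converts one digit character to its numeric value.
   Returns -1 if c is not a decimal digit. */
int digit_value(char c)
{
    /* The digits are contiguous and ordered in every C character set,
       so a simple range check identifies them. */
    if (c < '0' || c > '9')
        return -1;
    return c - '0';   /* correct for ASCII, EBCDIC, or any other set */
}
```

With this, digit_value('7') gives 7 regardless of the underlying character encoding.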


The loop itself is a simple shift-and-add type, to create a number from multiple digit characters. Say you have the string "123", and number is initially zero.

You multiply number (zero) by ten to get zero then add digit character '1' (49) and subtract 48. This gives you one.

You then multiply number (one) by ten to get ten and add digit character '2' (50), again subtracting 48. This gives you twelve.

Finally, you multiply number (twelve) by ten to get a hundred and twenty then add digit character '3' (51) and subtract 48. This gives you a hundred and twenty three.
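The three steps above can be written as a small function. This is only a sketch: parse_leading_digits is a hypothetical name, it stops at the first non-digit instead of at a space, and it makes no attempt to detect overflow of int.

```c
/* Hypothetical helper: accumulates the decimal digits at the start of s.
   Returns 0 if s does not begin with a digit. No overflow checking. */
int parse_leading_digits(const char *s)
{
    int number = 0;

    for (int i = 0; s[i] >= '0' && s[i] <= '9'; i++) {
        /* Multiplying by ten shifts the digits collected so far one
           decimal place to the left; the new digit fills the units place. */
        number = number * 10 + (s[i] - '0');
    }
    return number;
}
```

For "123" the intermediate values of number are 1, 12, and 123, matching the walk-through.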

There are better ways to do this in the C standard library: atoi or the more robust strtol-type functions, all found in stdlib.h. The latter allow you to better detect if there was "rubbish" at the end of the number, for assistance with validation (atoi cannot tell the difference between 123 and 123xyzzy).
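A rough sketch of that kind of validation with strtol follows. The wrapper name parse_long_strict and its return convention are assumptions made for illustration; only strtol, errno, and ERANGE come from the standard library.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical wrapper: returns 1 and stores the value in *out only if
   the entire string is a decimal integer that fits in a long. */
int parse_long_strict(const char *s, long *out)
{
    char *end;

    errno = 0;
    long value = strtol(s, &end, 10);

    if (end == s)
        return 0;              /* no digits were consumed at all */
    if (*end != '\0')
        return 0;              /* trailing rubbish, e.g. "123xyzzy" */
    if (errno == ERANGE)
        return 0;              /* value does not fit in a long */

    *out = value;
    return 1;
}
```

Here "123" is accepted while "123xyzzy" is rejected, whereas atoi("123xyzzy") would quietly return 123.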


And, as yet another aside, you should avoid gets() like the plague. It, like the "naked" scanf("%s"), is not suitable for user input and opens your code to buffer overflow problems. In fact, unlike scanf(), there is no safe way to use gets(), which is undoubtedly why it was removed from the standard in C11. A more robust approach is to read input with fgets(), which takes the buffer size as an argument.
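As one possible replacement for gets(), here is a minimal sketch built on fgets(). The name read_line is hypothetical, and the sketch deliberately ignores one case: if the input line is longer than the buffer, the excess characters stay in the stream for the next read.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from fp into buf (at most size - 1
   characters) and strips the trailing newline if one was read.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_line(FILE *fp, char *buf, size_t size)
{
    if (fgets(buf, (int)size, fp) == NULL)
        return 0;

    /* strcspn gives the index of the first '\n', or the string length
       if there is none, so this assignment is always in bounds. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

In the program from the question, gets(address) could then become read_line(stdin, address, sizeof address), assuming address is declared as an array.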

There's also a large class of addresses for which that code will fail miserably, such as:

3/28 Tivoli Rd
57a Smith Street
Flat 2, 12 Xyzzy Lane
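The original loop treats every character before the first space as a digit, so these inputs produce garbage rather than an error. One way to at least reject them is sketched below; parse_street_number is a hypothetical helper, and its rule (the first space-delimited token must be all digits) is an assumption about what counts as a valid street number.

```c
#include <ctype.h>

/* Hypothetical helper: succeeds only when the first space-delimited token
   of s consists entirely of decimal digits. On success stores the number
   in *out and returns 1; otherwise returns 0 and leaves *out untouched. */
int parse_street_number(const char *s, int *out)
{
    int number = 0;
    int i = 0;

    if (!isdigit((unsigned char)s[0]))
        return 0;                      /* e.g. "Flat 2, 12 Xyzzy Lane" */

    while (isdigit((unsigned char)s[i])) {
        number = number * 10 + (s[i] - '0');
        i++;
    }

    /* The digits must be followed by a space or the end of the string. */
    if (s[i] != ' ' && s[i] != '\0')
        return 0;                      /* e.g. "57a ..." or "3/28 ..." */

    *out = number;
    return 1;
}
```

Unlike the loop in the question, this also stops at the terminating null character, so an input with no space at all does not run off the end of the buffer.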

C requires the digits 0 through 9 to be stored contiguously, in that order, in the execution character set. 48 is the ASCII value of '0', so, for instance:

'3' - 48 == 3

for any digit.

ASCII is not required for C, so better is:

'3' - '0' 

because while 48 is right for ASCII, '0' is by definition right for any character set.

If address contains "456 ", then:

  • when i == 0 and number == 0, number * 10 + (address[0] - 48) equals 0 * 10 + 4, or 4.
  • when i == 1, number * 10 + (address[1] - 48) is 4 * 10 + 5, or 45.
  • when i == 2, number * 10 + (address[2] - 48) is 45 * 10 + 6, or 456.

and you're done.

Never use gets(): it's dangerous, and it isn't even part of C anymore.
