
Converting string to a number

I came across this C program:

int main() {
    printf("Enter your address, (e.g. 51 Anzac Road) ");
    gets(address);
    number = 0;
    i = 0;
    while (address[i] != ' ') {
        number = number * 10 + (address[i] - 48);
        i++;
    }
}

I understand that number = number * 10 + (address[i] - 48); is there to get the number from the input, but can anybody explain to me how this works? How does that produce the number from the input?

In ASCII, the digit characters '0' through '9' occupy code points 48 through 57 (in hex, 0x30 through 0x39), so to turn a digit character into a value, you just subtract 48.


As an aside, you should really subtract '0', since the standard doesn't guarantee ASCII, though it does guarantee that the digit characters are contiguous and ordered. C under z/OS, for example, uses EBCDIC, which places the digits at code points 0xf0 through 0xf9.
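As a minimal sketch of that portable form, the helper below (the name digit_value is made up for illustration) subtracts '0' rather than 48, so it should behave the same on ASCII and EBCDIC systems:

```c
/* Hypothetical helper, not part of the original program:
   convert one digit character to its numeric value.
   Returns -1 if c is not a decimal digit. */
int digit_value(char c)
{
    if (c >= '0' && c <= '9') {
        /* Portable: C guarantees '0'..'9' are contiguous and ordered. */
        return c - '0';
    }
    return -1;
}
```

With this, digit_value('7') yields 7 regardless of the execution character set.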


The loop itself is a simple shift-and-add type, used to build a number from multiple digit characters. Say you have the string "123", and number is initially zero.

You multiply number (zero) by ten to get zero, then add the digit character '1' (49) and subtract 48. This gives you one.

You then multiply number (one) by ten to get ten and add the digit character '2' (50), again subtracting 48. This gives you twelve.

Finally, you multiply number (twelve) by ten to get a hundred and twenty, then add the digit character '3' (51) and subtract 48. This gives you a hundred and twenty-three.
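The same shift-and-add steps can be sketched as a small function. This is only an illustration, not the original program: the name leading_number is hypothetical, it stops at the first non-digit (including the end of the string) rather than only at a space, and it assumes the value fits in an int:

```c
#include <ctype.h>

/* Hypothetical helper: accumulate the leading decimal digits of s.
   Stops at the first non-digit, which includes the terminating '\0'.
   No overflow check: assumes the result fits in an int. */
int leading_number(const char *s)
{
    int number = 0;

    while (isdigit((unsigned char)*s)) {
        /* Shift the digits seen so far one decimal place left,
           then add the value of the new digit. */
        number = number * 10 + (*s - '0');
        s++;
    }
    return number;
}
```

For "123" the variable number goes 0, 1, 12, 123, matching the walk-through above.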

There are better ways to do this in the C standard library: atoi or the more robust strtol-type functions, all found in stdlib.h. The latter let you detect whether there was "rubbish" at the end of the number, which helps with validation (atoi cannot tell the difference between 123 and 123xyzzy).
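One possible way to use strtol for that kind of validation is sketched below. The function name parse_long_strict and its return convention are assumptions made for this example; only strtol, errno and ERANGE come from the standard library:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse the whole of s as a base-10 long.
   Returns 1 and stores the value in *out on success.
   Returns 0 if s has no digits, has trailing characters,
   or holds a value out of range for long. */
int parse_long_strict(const char *s, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);

    if (end == s) {
        return 0;            /* nothing was converted */
    }
    if (*end != '\0') {
        return 0;            /* trailing "rubbish" such as "xyzzy" */
    }
    if (errno == ERANGE) {
        return 0;            /* overflow or underflow */
    }
    *out = value;
    return 1;
}
```

Note that strtol itself skips leading whitespace and accepts an optional sign, so " -42" would be accepted by this sketch. For an address, you would instead inspect what end points at rather than rejecting every trailing character.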


And, as yet another aside, you should avoid gets() like the plague. It, like the "naked" scanf("%s"), is not suitable for user input, and opens your code to buffer overflow problems. In fact, unlike scanf(), there is no safe way to use gets(), which is undoubtedly why it was removed from the language in C11. A more robust user input function can be found here.
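As a rough sketch of a bounded alternative, the helper below reads a line with fgets, which never writes more than the buffer size you give it. The name read_line and its return convention are assumptions for this example, and it assumes size is at least 1:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from in into buf, storing at most
   size - 1 characters, and strip the trailing newline if present.
   Returns 1 on success, 0 on end-of-file or read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    /* strcspn returns the index of the first '\n', or the string
       length if there is none, so this is safe in both cases. */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A call such as read_line(stdin, address, sizeof address) would replace gets(address), given a declaration like char address[100]. A line longer than the buffer is truncated and the remainder stays in the stream, so a fuller version would also need to detect and discard that leftover input.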

There's also a large class of addresses for which that code will fail miserably, such as:

3/28 Tivoli Rd
57a Smith Street
Flat 2, 12 Xyzzy Lane

C requires the digits 0 through 9 to be stored contiguously, in that order, in the execution character set. 48 is the ASCII value of '0', so, for instance:

'3' - 48 == 3

and likewise for any digit.

ASCII is not required for C, so better is:

'3' - '0' 

because while 48 is right for ASCII, '0' is by definition right for any character set.

If address contains "456 ", then:

  • when i == 0 and number == 0, number * 10 + (address[0] - 48) equals 0 * 10 + 4, or 4.
  • when i == 1, number * 10 + (address[1] - 48) is 4 * 10 + 5, or 45.
  • when i == 2, number * 10 + (address[2] - 48) is 45 * 10 + 6, or 456.

At i == 3 the loop sees the space and stops, and you're done.

Never use gets(): it's dangerous, and it isn't even part of C anymore.
