When incrementing an element of an array in C, why must ++array[c - '0'] be used and not ++array[c]?

I've been reading The C Programming Language 2nd edition by Brian W Kernighan and Dennis Ritchie in order to learn the C programming language. In it is an example of code that counts the number of times a digit occurs in a string of input. The code is as follows:

#include <stdio.h>

/* count digits, white space, and other characters */
int main(void)
{
    int c, i, nwhite, nother;
    int ndigit[10];

    nwhite = nother = 0;
    for (i = 0; i < 10; ++i)
        ndigit[i] = 0;

    while ((c = getchar()) != EOF) {
        if (c >= '0' && c <= '9')
            ++ndigit[c - '0'];    /* map '0'..'9' to index 0..9 */
        else if (c == ' ' || c == '\n' || c == '\t')
            ++nwhite;
        else
            ++nother;
    }
    printf("\ndigits =");
    for (i = 0; i < 10; ++i)
        printf(" %d", ndigit[i]);
    printf(", white space = %d, other = %d\n", nwhite, nother);
    return 0;
}

If, after starting this program, I were to enter "11111", the first if statement of the while loop would recognize each character as a digit and would thus increment the second element of the array until it reached 5. I'm trying to understand the C language as best as I can, and I just don't see the logic behind using

++ndigit[c - '0'];

If I were to input "11111" into the program, it would correctly return something along the lines of

0 5 0 0 0 0 0 0 0 0

and thus indicating that "1" was typed five times. Intuitively, I would instead type

++ndigit[c];

It seems to me that because the variable c would be 1 five times, this bit of code would correctly increment not the 0th element but the 1st element of the array to 5, just as it should. If I actually implement this bit of code, however, the same input of "11111" returns

0 0 0 0 0 0 0 0 0 0

This I don't understand at all. Now, it seems no elements of the array are being incremented despite the fact I told it to increment the cth element.

Just some further testing: I went ahead to see what would happen if I implemented

++ndigit[c - '1'];

The same input of "11111" returned

5 0 0 0 0 0 0 0 0 0

which I suppose makes sense, as it's changing the (1-1)th element of the array. I still don't understand why the "- '0'" is necessary. If you could help me understand its usage, that would be great. Thanks.

Because c holds a character code returned by getchar(), and the character '0' is not the same as the number 0. In the ASCII table, the character '0' has the value 48.

When you compute c - '0', you turn the character '0' into the actual 0 you want, '1' into 1, and so on, because the digit characters have consecutive codes (the C standard guarantees this for '0' through '9').

'0' == 48
'1' == 49
'2' == 50
'3' == 51
'4' == 52
'5' == 53
'6' == 54
'7' == 55
'8' == 56
'9' == 57

It's because the characters use the ASCII encoding. This means that if you took the bits of the character '0' and read them as an integer, you would get the value 48.

Because you want your characters to line up with the integers they represent, you need to take that offset into account and subtract 48, or better '0', from each one.

The only ASCII characters that you're counting are '0' through '9', inclusive. You're using an array of 10 ints, ndigit, to count how many occurrences of those characters there have been. Because '0' does not equal 0 (it equals 48), you must offset it to access the correct element of ndigit:

++ndigit[c - '0']; // '0' is the offset

If you didn't offset the ASCII character, you would need an array with a length of 58 to be able to do something like this:

++ndigit['9'];

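This would waste memory because elements 0 to 47 would never be used. With the 10-element array, ndigit['1'] means ndigit[49], which is past the end of the array; the behavior is undefined, and that is why none of the ten counters appeared to change.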
