
C program confusing from Ritchie and Kernighan

I have read everywhere that C is a must-know language before you graduate college if you are in computer science. I have picked up the book by Ritchie and Kernighan (2nd edition) recommended by my professor. I am quite confused by how this program works. If anyone could explain it, that would be great! The program counts the occurrences of each digit, of white space characters, and of all other characters. I wrote comments for the things I already understand, and added questions where I don't understand how it works.

#include <stdio.h>

int main()
{
     int c, i, nwhite, nother;   //Declaring variables
     int ndigit[10];             //Declaring an array that holds indexes from 0-9

     nwhite = nother = 0;        //Initializes two variables to 0

     for (i = 0; i < 10; ++i)    //Does this for loop keep going through and for each index
             ndigit[i] = 0;      //0-9 set the value of it to 0?

     while ((c = getchar()) != EOF)  //While loop to check if char at EOF
     {
          if (c>='0' && c <= '9')        //Completely lost on how this works?
               ++ndigit[c - '0'];
          else if (c == ' ' || c == '\n' || c == '\t')     //Checks for white space 
               ++nwhite;
          else                                   //If nothing else increment char variable
               ++nother;
     }

     printf("digits ");               //First for loop traverses and prints each value of
     for (i=0; i<10; ++i)             //the index. The next printf prints white space and
          printf(" %d", ndigit[i]);   //other characters
     printf(", white space = %d, other = %d\n", nwhite, nother);
}

Yes, this loops over the entire ndigit array and sets each element to 0.

for (i = 0; i < 10; ++i)   
    ndigit[i] = 0; 

Here:

if (c>='0' && c <= '9')
    ++ndigit[c - '0'];

the condition (on the if line) might be the most difficult thing to understand.

Imagine it like this:

if( A and B)
    ...

with:

A = c>='0'

and

B = c<= '9'

that is, if c is between '0' and '9' (inclusive).

Hope this helps.

(c >= '0' && c <= '9')        //Completely lost on how this works?

The char type in C is just a small integer type (8 bits on virtually all current platforms); whether plain char is signed is implementation-defined. The correspondence between characters such as '1' and '9' and integers such as 49 and 57 is defined by the ASCII code. The code snippet above evaluates to 1 if c is a digit and to 0 otherwise. On an ASCII system, writing '1' in C is equivalent to writing 49.
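To make the "characters are small integers" point concrete, here is a minimal sketch. The helper names is_digit_char and show_code are made up for this illustration, and the value 49 mentioned in the comment assumes an ASCII system.

```c
#include <stdio.h>

/* Hypothetical helper (not from the book): returns 1 if c is a decimal
   digit character and 0 otherwise, i.e. the value of the comparison. */
int is_digit_char(int c)
{
    return c >= '0' && c <= '9';
}

/* Prints the integer value behind a character constant.
   On an ASCII system, show_code('1') prints: '1' has the value 49 */
void show_code(int c)
{
    printf("'%c' has the value %d\n", c, c);
}
```

In real code you would normally reach for isdigit from <ctype.h>, which performs the same test.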

The array is declared inside main, so it has automatic storage duration (the auto storage class). In C, an automatic array that is not explicitly initialized holds indeterminate (garbage) values, so each element is set to zero first.

for (i = 0; i < 10; ++i)   
   ndigit[i] = 0; 
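As an aside, the same zeroing can be written with an initializer instead of a loop. This is only a sketch of an alternative, not what the book's program does; the function name fresh_counter_total is invented for the example.

```c
/* Sums a freshly declared counter array. Because of the = {0}
   initializer every element starts at 0, so the total is 0.
   Without the initializer the values would be indeterminate. */
int fresh_counter_total(void)
{
    int ndigit[10] = {0};   /* first element set to 0, the rest zeroed implicitly */
    int i, total = 0;

    for (i = 0; i < 10; ++i)
        total += ndigit[i];
    return total;
}
```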

For the if statement, you are checking whether the given character is a digit between '0' and '9'. Since getchar() returns characters, the values are compared against character constants written in single quotes.

if (c>='0' && c <= '9') 

You can also change this statement to

if (c>=48 && c <= 57)

since the ASCII code for '0' is 48 and the code for '9' is 57.

++ndigit[c - '0'];

You are keeping a count of the digits entered in the array; i.e., if the input is 5, then you are increasing the count at index 5 of the array by 1.

c - '0' is used here because, as mentioned above, the ASCII code for '5' is 53. So the expression evaluates to 53 - 48 = 5 (index 5 of the array).
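The subtraction can be tried in isolation with a small sketch like the one below. The helper name digit_value and the -1 return for non-digits are assumptions made for this example; they are not part of the original program.

```c
/* Hypothetical helper: maps '0'..'9' to 0..9, or returns -1 for
   anything that is not a decimal digit. */
int digit_value(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';     /* e.g. '5' - '0' is 53 - 48 = 5 in ASCII */
    return -1;
}
```

With this helper, digit_value('5') gives 5, which is exactly the index the program increments in ndigit.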

 int ndigit[10];             //Declaring an array that holds indexes from 0-9

 for (i = 0; i < 10; ++i)    //Does this for loop keep going through and for each index

In C, arrays are indexed starting from 0, so in this case the valid indices of ndigit are 0 through 9, inclusive. The for loop starts from i = 0 and increments i by one (++i) as long as i < 10, so the loop body is executed for values of i from 0 to 9, inclusive.

In this program it is left up to the reader to figure out that the two 10 literals are the same thing; arguably it would be better style to #define a constant and use it in both places (but in this case it might be forgiven, since the number of decimal digits is constant at 10).
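A sketch of what the named-constant style could look like is shown here. The macro name NDIGITS and the function count_digits are hypothetical, and the function works on a string instead of standard input so that it is easy to try out.

```c
#define NDIGITS 10   /* hypothetical name for the number of decimal digits */

/* Counts how often each digit occurs in a NUL-terminated string.
   counts must have room for NDIGITS elements. */
void count_digits(const char *text, int counts[NDIGITS])
{
    int i;

    for (i = 0; i < NDIGITS; ++i)      /* same zeroing loop, using the constant */
        counts[i] = 0;

    for (; *text != '\0'; ++text)
        if (*text >= '0' && *text <= '9')
            ++counts[*text - '0'];
}
```

The array size and the loop bound now come from the same name, so they cannot drift apart if one of them is edited.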

 while ((c = getchar()) !=EOF)  //While loop to check if char at EOF

getchar returns one character read from the standard input, or EOF (which has a value distinct from any char) if the standard input is at end of file. There is no EOF character that would signal this condition. This is why c is declared as int rather than char: it must be able to hold every character value and also EOF.
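The same read-until-EOF pattern can be sketched on an arbitrary stream. The function name count_bytes is an assumption for this example, and getc is used in place of getchar so that the stream can be passed in; getchar() behaves like getc(stdin).

```c
#include <stdio.h>

/* Counts the bytes that can be read from a stream. c is an int so
   that it can hold every byte value as well as the separate EOF value. */
long count_bytes(FILE *in)
{
    int c;
    long n = 0;

    while ((c = getc(in)) != EOF)
        ++n;
    return n;
}
```

If c were declared as char, a byte such as 0xFF could be mistaken for EOF on platforms where char is signed, and EOF might never be detected where char is unsigned.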

      if (c>='0' && c <= '9')        //Completely lost on how this works?

The values of the characters for decimal digits 0 through 9 are contiguous in ASCII, and the C standard requires the same of every character set an implementation uses, so this code takes advantage of that and checks if c is equal to or greater than '0' and (&&) also equal to or less than '9'. In other words, this checks if c is in the range between '0' and '9', inclusive. '0' and '9' are character literals that have the integer values of the corresponding characters in the character set. This way the programmer does not have to know their values and write if (c>=48 && c<=57), and the code also works on character sets that are incompatible with ASCII, as long as the digit characters have contiguous values.

           ++ndigit[c - '0'];

This counts the number of times each digit occurs. Again, since the characters representing the digits are contiguous, subtracting the value of the first digit ('0') from the character that was read results in the values 0 through 9 for the corresponding characters, which are also the valid indices of the array ndigit, as discussed above. Of course this subtraction only works on digit characters, which is why the preceding if checks for that.

For example, in ASCII '0' has the value 48, '1' is 49, etc. So if c was '1', then the result of the subtraction c - '0' is '1' - '0', i.e., 49 - 48, or 1.

The ++ operator increments the value at the corresponding index of the ndigit array. (The first for loop set the initial values in ndigit to zero.)

EOF, an abbreviation for end-of-file, is a macro defined in the stdio.h header file. A macro is a fragment of code which has been given a name. Wherever the name is used, it is replaced by the contents of the macro. EOF is a value which represents the condition that no more characters can be read from a source, which in this case is stdin, the standard input stream, usually the keyboard. In short, for any actual character c that getchar returns, the condition c == EOF is always false. So the following condition will keep reading from stdin until you signal EOF by pressing Ctrl+D on *nix systems or Ctrl+Z on Windows systems:

while((c = getchar()) != EOF) {
}

Next, this condition checks whether the character read, c, is a numeric character, i.e., a decimal digit.

if(c>='0' && c <= '9') {  // check if c is a numeric character.
} 

Characters in C are actually integer values (think of them as 1-byte integers), and on most systems they are equal to the ASCII codes of the corresponding characters (though ASCII is not mandated by the C standard). In ASCII, the code values for the characters '0' through '9' are consecutive and in increasing order. So, if the above condition is satisfied, then c is a numeric character.

++ndigit[c - '0'];

In the above statement, c - '0' is the numeric character converted to its integer value. So, if the value of c was '8', then c - '0' evaluates to 8. Next, it fetches the value in the array ndigit at index 8, which keeps the count of the digit 8, and increments it by 1.
