Why is this array being initialized in an odd way?

Question

I am reading K&R 2nd Edition and I am having trouble understanding exercise 1-13. The answer is this code:

#include <stdio.h>

#define MAXHIST 15  
#define MAXWORD 11  
#define IN 1        
#define OUT 0      


main()
{

    int c, i, nc, state;
    int len;
    int maxvalue;
    int ovflow;
    int wl[MAXWORD];

    state = OUT;
    nc = 0;         
    ovflow = 0;

    for (i = 0; i < MAXWORD; i++)
        wl[i] = 0;  

    while ((c = getchar()) != EOF)
    {
        if(c == ' ' || c == '\n' || c == '\t')
        {
            state = OUT;            
            if (nc > 0)
            {
                if (nc < MAXWORD)   
                    ++wl[nc];       
                else
                    ++ovflow;       
            }                       
            nc = 0;                 
        }
        else if (state == OUT)
        {
            state = IN;             
            nc = 1;                 
        }
        else
            ++nc;                   
    }

    maxvalue = 0;
    for (i = 1; i < MAXWORD; ++i)
    {
        if(wl[i] > maxvalue)
            maxvalue = wl[i];       
    }

    for(i = 1; i < MAXWORD; ++i)
    {
        printf("%5d - %5d : ", i, wl[i]);
        if(wl[i] > 0)
        {
            if((len = wl[i] * MAXHIST / maxvalue) <= 0)
                len = 1;
        }
        else
            len = 0;

        while(len > 0)
        {
            putchar('*');
            --len;
        }
        putchar('\n');
    }

    if (ovflow > 0)
        printf("There are %d words >= %d\n", ovflow, MAXWORD);

    return 0;

}

At the top, wl is declared and then initialized. What I don't understand is why the code loops through it and sets every element to zero if it just counts the lengths of words. It doesn't keep track of how many words there are, it just keeps track of the word length, so why is everything set to 0?

I know this is unclear; it's just been stressing me out for the past 20 minutes and I don't know why.

Answer 1

The i-th element of the array wl[] is the number of words of length i that have been found in the input. The wl[] array needs to be zero-initialized first so that ++wl[nc]; does not cause undefined behavior by attempting to use an uninitialized variable, and so that array elements that represent word lengths that are not present reflect that no such word lengths were found.

Note that ++wl[nc] increments the value wl[nc] when a word of length nc is encountered. If the array were not initialized, the first time the code attempts to increment an array element, it would be attempting to increment an indeterminate value. This attempt would cause undefined behavior.

Further, array indices that represent counts of word lengths that are not found in the input should hold values of zero, but without the zero-initialization, these values would be indeterminate. Even attempting to print these indeterminate values would cause undefined behavior.

The moral: initialize variables to sensible values, or store values in them, before attempting to use them.

It would seem simpler and be more clear to use an array initializer to zero-initialize the wl[] array:

int wl[MAXWORD] = { 0 };

With this initializer, there is no need for the loop that sets the array values to zero (unless the array is used again for another file). But the posted code is from The C Answer Book by Tondo and Gimpel. This book provides solutions to the exercises found in the second edition of K&R in the style of K&R, and using only ideas that have been introduced in the book before each exercise. This exercise, 1-13, occurs in "Chapter 1 - A Tutorial Introduction". This is a brief tour of the language lacking many details to be found later in the book. At this point, assignment and arrays have been introduced, but array initializers have not (this has to wait until Chapter 4), and the K&R code that uses arrays has initialized arrays using loops thus far. Don't read too much into code style from the introductory chapter of a book that is 30+ years old.

Much has changed in C since K&R was published; e.g., a bare main() with no return type is no longer valid, because C99 removed the implicit int rule. The definition should be one of int main(void) or int main(int argc, char *argv[]) (or equivalently int main(int argc, char **argv)), with a caveat for implementation-defined signatures for main().

Answer 2

Everything is set to 0 because if you don't initialize the array, its elements hold indeterminate (effectively random) values, and incrementing those values gives wrong counts. Instead of looping over every position of the array, you could write int wl[MAXWORD] = {0}; in place of int wl[MAXWORD]; this puts 0 at every position in the array, so you don't have to do the loop.

Answer 3

I edited your code and put some comments in as I was working through it, to explain what's going on. I also changed some of your histogram calculations because they didn't seem to make sense to me.

Bottom line: It's using a primitive "state machine" to count up the letters in each group of characters that isn't white space. It stores this in wl[] such that wl[i] contains an integer that tells you how many groups of characters (sometimes called "tokens") have a word length of i. Because this is done by incrementing the appropriate element of wl[], each element must be initialized to zero. Failing to do so would lead to undefined behavior, but probably would result in nonsensical and absurdly large counts in each element of wl[].

Additionally, any token with a length that can't be reflected in wl[] will be tallied in the ovflow variable, so at the end there will be an accounting of every token.

#include <stdio.h>

#define MAXHIST 15  
#define MAXWORD 11  
#define IN 1        
#define OUT 0      


int main(void) {
  int c, i, nc, state;
  int len;
  int maxvalue;
  int ovflow;
  int wl[MAXWORD];

  // Initializations
  state = OUT;  //Start off not assuming we're IN a word
  nc = 0;       //Start off with a character count of 0 for current word
  ovflow = 0;   //Start off not assuming any words > MAXWORD length

  // Start off with our counters of words at each length at zero
  for (i = 0; i < MAXWORD; i++) {
    wl[i] = 0;  
  }

  // Main loop to count characters in each 'word'
  // state keeps track of whether we are IN a word or OUTside of one
  // For each character in the input stream...
  //   - If it's whitespace, set our state to being OUTside of a word
  //     and, if we have a character count in nc (meaning we've just left
  //     a word), increment the counter in the wl (word length) array.
  //     For example, if we've just counted five characters, increment
  //     wl[5], to reflect that we now know there is one more word with 
  //     a length of five.  If we've exceeded the maximum word length,
  //     then increment our overflow counter.  Either way, since we're
  //     currently looking at a whitespace character, reset the character
  //     counter so that we can start counting characters with our next
  //     word. 
  //   - If we encounter something other than whitespace, and we were 
  //     until now OUTside of a word, change our state to being IN a word
  //     and start the character counter off at 1.
  //   - If we encounter something other than whitespace, and we are
  //     still in a word (not OUTside of a word), then just increment
  //     the character counter.
  while ((c = getchar()) != EOF) {
    if (c == ' ' || c == '\n' || c == '\t') {
      state = OUT;            
      if (nc > 0) {
        if (nc < MAXWORD) ++wl[nc];
        else ++ovflow;       
      }                       
      nc = 0;                 
    } else if (state == OUT) {
      state = IN;             
      nc = 1;                 
    } else {
      ++nc;
    }
  }

  // Find out which length has the most number of words in it by looping
  // through the word length array. 
  maxvalue = 0;
  for (i = 1; i < MAXWORD; ++i) {
    if(wl[i] > maxvalue) maxvalue = wl[i];       
  }

  // Print out our histogram
  for (i = 1; i < MAXWORD; ++i) {
    // Print the word length - then the number of words with that length
    printf("%5d - %5d : ", i, wl[i]);

    if (wl[i] > 0) {
      len = wl[i] * MAXHIST / maxvalue;
      if (len <= 0) len = 1;
    } else {
      len = 0;
    }

    // This is confusing and unnecessary.  It's integer division, with no
    // negative numbers.  What we want to have happen is that the length
    // of the bar will be 0 if wl[i] is zero; that the bar will have length
    // 1 if the bar is otherwise too small to represent; and that it will be
    // expressed as some fraction of MAXHIST otherwise. 
    //if(wl[i] > 0)
    //    {
    //        if((len = wl[i] * MAXHIST / maxvalue) <= 0)
    //            len = 1;
    //    }
    //    else
    //        len = 0;

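    // Multiply MAXHIST (our histogram maximum length) times the relative 
    // fraction, i.e., we're using a histogram bar length of MAXHIST for
    // our statistical mode, and interpolating everything else.  Skip the
    // division when maxvalue is 0 (empty input): 0.0 / 0 is NaN, and
    // converting NaN to int is undefined behavior.
    len = (maxvalue > 0) ? (int)(((double)wl[i] / maxvalue) * MAXHIST) : 0;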

    // Our one special case might be if maxvalue is huge, a word length
    // with just one occurrence might be rounded down to zero.  We can fix
    // that manually instead of using a weird logic structure.
    if ((len == 0) && (wl[i] > 0)) len = 1;

    while (len > 0) {
      putchar('*');
      --len;
    }

    putchar('\n');
  }

  // If any words exceeded the maximum word length, say how many there were.
  if (ovflow > 0) printf("There are %d words >= %d\n", ovflow, MAXWORD);

  return 0;
}

3 answers

solution1
3 2018-12-06 03:07:02

solution2
1 2018-12-06 03:24:02

solution3
-1 2018-12-06 03:23:02
