
'\0' character in a while loop in C

I want to write a function that takes an array of strings and counts the words. I don't think the code below is wrong. Every time a blank appears, it is counted. BUT when a '\0' character shows up, the while loop doesn't do anything. Is there something I don't know?

int Number(char w[][20]) {
    int i, counter, j;
    counter = 0;
    for (i = 0; i < 4; i++) {
        j = 0;
        do {
            if ((w[i][j] == '\0') || (w[i][j] == ' '))
                ++counter;
            j++;
        } while (w[i][j] != '\0');
        printf("counter=%d\n", counter);
    }
}

Here's a working version of your code, and a test program.

#include <stdbool.h>
#include <stdio.h>

static int wc(const char *str)
{
    int count = 0;
    bool inword = false;

    char c;
    while ((c = *str++) != '\0')
    {
        if (c == ' ')
            inword = false;
        else
        {
            if (inword == false)
                count++;
            inword = true;
        }
    }
    return count;
}

static void Number(const char *tests[], int num_tests)
{
    for (int i = 0; i < num_tests; i++)
        printf("%d: [%s]\n", wc(tests[i]), tests[i]);
}

int main(void)
{
    const char *tests[] =
    {
        "",
        " ",
        "  ",
        "a",
        "a b",
        " a b ",
        "  ab  cd  ",
        "The quick brown  fox jumps  over   the  lazy     dog.",
        "      The quick brown  fox jumps  over   the  lazy     dog.    ",
    };
    enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };
    Number(tests, NUM_TESTS);
    return 0;
}

Some notes on the code:

  • Your Number() function does two jobs and should do only one, delegating the other to a separate function: it both counts the words in each string and prints the result. I delegate the word counting to a separate function, wc(), which simplifies Number() almost to the point that it isn't needed.
  • My version of Number() is told the number of entries in the array it is working on, rather than relying on a magic number like 4.
  • The output from my code prints each string next to its count, so you can check it for accuracy. Printing only the number doesn't allow that; you have to look at the code to see what the numbers mean.
  • Your Number() function is defined to return an int but doesn't actually do so. This version is defined not to return anything, and it doesn't.

The output from the code is:

0: []
0: [ ]
0: [  ]
1: [a]
2: [a b]
2: [ a b ]
2: [  ab  cd  ]
9: [The quick brown  fox jumps  over   the  lazy     dog.]
9: [      The quick brown  fox jumps  over   the  lazy     dog.    ]

Obviously, you could refine the space testing using the isblank() or isspace() macros (functions) from <ctype.h> if you wished, or define the boundary between word and not-word in other ways. The basic concept is reliable, though, across fairly perverse sequences of blanks and words.
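
As a minimal sketch of that refinement, the variant below treats any character accepted by isspace() (blank, tab, newline and so on) as a separator. The name wc_space() is made up for this illustration, and the assert() on the argument is an added precondition check rather than part of the original wc().

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical variant of wc(): any isspace() character separates words. */
int wc_space(const char *str)
{
    assert(str != NULL);

    int count = 0;
    bool inword = false;

    for (; *str != '\0'; str++)
    {
        /* Cast to unsigned char: passing a negative value to isspace() is undefined. */
        if (isspace((unsigned char)*str))
            inword = false;
        else
        {
            if (!inword)
                count++;    /* separator-to-word transition: a new word starts */
            inword = true;
        }
    }
    return count;
}
```

With this version a string such as "a\tb\nc" counts as three words, whereas a test against ' ' alone would report one.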

If you really want a 2D array of characters, it is not hard to write the code to work with that, though the 'lazy dog' strings will have to be reduced to fit into char data[][20] cleanly. The basic ideas remain the same — and the wc() function doesn't change.

#include <stdbool.h>
#include <stdio.h>

static int wc(const char *str)
{
    int count = 0;
    bool inword = false;

    char c;
    while ((c = *str++) != '\0')
    {
        if (c == ' ')
            inword = false;
        else
        {
            if (inword == false)
                count++;
            inword = true;
        }
    }
    return count;
}

static void Number(const char tests[][20], int num_tests)
{
    for (int i = 0; i < num_tests; i++)
        printf("%d: [%s]\n", wc(tests[i]), tests[i]);
}

int main(void)
{
    const char tests[][20] =
    {
        "",
        " ",
        "  ",
        "a",
        "a b",
        " a b ",
        "  ab  cd  ",
        "The quick brown fox",
        "  jumps   over     ",
        "  the  lazy  dog   ",
    };
    enum { NUM_TESTS = sizeof(tests) / sizeof(tests[0]) };
    Number(tests, NUM_TESTS);
    return 0;
}

Output:

0: []
0: [ ]
0: [  ]
1: [a]
2: [a b]
2: [ a b ]
2: [  ab  cd  ]
4: [The quick brown fox]
2: [  jumps   over     ]
3: [  the  lazy  dog   ]

Tests like the "  ab  cd  " example (with double blanks at the start, middle and end) are often very good ones for pushing edge cases, in more contexts than just word counting. Many a shell script will not correctly handle arguments like that string, for example.

 - The first principle of programming is: divide and conquer
 - The second is: use a loop to iterate
So:

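unsigned count_words(const char *str)
{
    unsigned len, words;

    for (len = words = 0; *str; str++) {
        if (*str == ' ') {
            /* a space ends the current word, if there is one */
            if (len) { words++; len = 0; }
        } else {
            /* braces are required: without them the else pairs with if(len) */
            len += 1;
        }
    }

    /* count a final word that is not followed by a space */
    if (len) { words++; }
    return words;
}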

Now call this function for each of your strings, adding up the results.
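
A rough sketch of that last step for a char[][20] array follows. count_all_words() is a hypothetical helper name, and count_words() is repeated with explicit braces around the if/else so that the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as count_words() above, repeated so this block compiles alone. */
unsigned count_words(const char *str)
{
    unsigned len = 0, words = 0;

    for (; *str; str++) {
        if (*str == ' ') {
            if (len) { words++; len = 0; }  /* a space closes the current word */
        } else {
            len++;
        }
    }
    if (len)
        words++;                            /* last word had no trailing space */
    return words;
}

/* Hypothetical helper: total words across `rows` strings of a char[][20] array. */
unsigned count_all_words(const char w[][20], size_t rows)
{
    assert(w != NULL);

    unsigned total = 0;
    for (size_t i = 0; i < rows; i++)
        total += count_words(w[i]);         /* divide: one string at a time */
    return total;
}
```

The parameter is const-qualified here, so the array passed in should be declared const char data[][20] as well; drop the const from the parameter if the array has to stay modifiable.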

There are some problems in your code:

  • the do / while loop increments j before it tests for the end of the string. If a line has no characters (it starts with a '\0'), the test therefore looks at the character after the terminator. For a non-empty line the loop exits as soon as w[i][j] is '\0', so the '\0' comparison inside the body never sees the terminator. Avoid do / while loops: they are error prone, and this bug is a classic one.
  • you merely count the number of spaces: this is not the same as the number of words, because there can be multiple spaces between two words, or at the start or end of the line. Instead, count the number of transitions from space to non-space, treating the start of the line as a space.

Here is a simple implementation for your 2D array:

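#include <stdio.h>

int Number(char w[][20]) {
    int i, j, counter = 0;
    for (i = 0; i < 4; i++) {
        char c, last = ' ';  /* treat the start of each line as a space */
        for (j = 0; (c = w[i][j]) != '\0'; j++) {
            if (last == ' ' && c != ' ')
                counter++;   /* space to non-space: a new word starts */
            last = c;
        }
    }
    printf("counter=%d\n", counter);
    return counter;          /* the function is declared to return int */
}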
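
If the row count should not be hard-coded, the same transition counting can take it as a parameter and return the total instead of printing it. This is only a sketch: count_words_2d() is a made-up name, and the j < 20 bound is an extra guard, added here, for rows that fill all 20 bytes and so have no terminator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical variant of Number(): the caller supplies the row count,
 * and the total is returned rather than printed. */
int count_words_2d(const char w[][20], size_t rows)
{
    assert(w != NULL);

    int counter = 0;
    for (size_t i = 0; i < rows; i++) {
        char last = ' ';    /* treat the start of each line as a space */
        for (size_t j = 0; j < 20 && w[i][j] != '\0'; j++) {
            if (last == ' ' && w[i][j] != ' ')
                counter++;  /* space to non-space: a new word starts */
            last = w[i][j];
        }
    }
    return counter;
}
```

A caller can then compute the row count with sizeof(data) / sizeof(data[0]) and decide for itself how to print the result.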

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 