
Word counting program in C counts more than it should

OK guys, so I wrote this program:

#include <stdio.h>

/* count words */

int main(void)
{

    int c, c2 = ' ';  /* c2: previous character; start as if preceded by whitespace */
    long count = 0;

    while ((c = getchar()) != EOF)
    {
        switch(c)
        {
        case ' ':
        case '\n':
        case '\t':
            switch(c2)
            {
            case ' ':
            case '\n':
            case '\t':
                break;
            default:
                ++count;
            }
        }
        c2 = c;
    }
    printf("Word count: %ld\n", count);
}

It counts words from an input, as you can see. So I wrote a file called a-text that only has

a text

and I wrote in the Ubuntu prompt

./cw < a-text

and it wrote

Word count: 2

So, what the heck? Shouldn't it just count 1, because after the second word there's no tab, newline or space, only EOF? Why does this happen?

Why not count words rather than spaces? Then you don't have a problem when the input ends with whitespace.

#include <ctype.h>
#include <stdio.h>

int main(void) {
    int was_space = 1;  /* treat the start of input as whitespace */
    int c;
    int count = 0;
    while ((c = getchar()) != EOF) {
        /* a word starts at a non-space character that follows whitespace */
        count += !isspace(c) && was_space;
        was_space = isspace(c);
    }
    printf("Word count: %d\n", count);
    return 0;
}

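Let's see what happens with "a text". Note that the file almost certainly ends with a newline, since most text editors append one when saving, so the input is really "a text\n":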

after the first iteration, c2 == 'a' and count remains 0
now comes c == ' '; c2 is still 'a', so count becomes 1 and c2 becomes ' '
now comes c == 't'; c2 is still ' ', so count remains 1
...
now comes c == '\n'; c2 is the last 't', so count becomes 2

In other words:

"a text\n"
  ^----^-------- count == 1
       |
       +-------- count == 2

