Is the behavior of getchar() correct for this program?

The following code shows strange behaviour. If I press Enter (newline) before ending the input, it prints the histogram values; if I enter EOF (^Z) directly after the text, it shows all zeros. Is there a problem with getchar(), such that it only accepts input once a newline has been pressed?

#include <stdio.h>
#define IN 1 /* inside a word */
#define OUT 0 /* outside a word */
#define MAXLEN 50
/* count lines, words, and characters in input */
main()
{
    int c, i, j, nc, state;
    int wordlength[MAXLEN];
    state = OUT;
    nc = 0;
    for (i = 0; i < MAXLEN; i++)
        wordlength[i] = 0;
    while ((c = getchar()) != EOF) {
        if (c == ' ' || c == '\n' || c == '\t') {
            if (state == IN) {
                wordlength[nc-1]++;
            }
            state = OUT;

        }
        else if (state == OUT) {
            //putchar('\n');
            state = IN;
            nc = 0;
        }
        if (state == IN)    {
            ++nc;
        }
    }

    for (j = 0; j < MAXLEN; j++)
            printf("\n%d - %d",j,wordlength[j]);

    for (i = 10; i >= 0; i--) {
        for (j = 0; j < MAXLEN; j++)
            printf(((wordlength[j] > i)?"|":" "));
        printf("\n");

    }

}

Your code works more or less sanely for me unless I type a single word of input not followed by any white space (blank, tab, newline) before indicating EOF (Control-D on my machine; if you use Control-Z, it suggests you are running on Windows). If you indicate EOF without a final white space, the last word is not added to the histogram. You should, of course, also check that the word length is not too big, so that you do not index outside the wordlength array (if (nc > MAXLEN) nc = MAXLEN; counts all the very long words as the same size).

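After the main processing loop, you should check whether a word was still in progress (state == IN, in which case nc > 0) and, if so, increment the appropriate entry in wordlength. Testing nc > 0 alone is not quite enough: nc is not reset when a word ends, so input that ends with white space would have its final word counted twice.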

Consider using isspace() from <ctype.h>, too.
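As a minimal sketch of the isspace() suggestion, the function below builds the same kind of histogram from a string rather than from getchar(). The name count_word_lengths and its parameters are hypothetical, chosen for illustration. It assumes maxlen is at least 1. The cast to unsigned char matters when the characters come from a char array; values returned by getchar() are already in that range (or EOF).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the original program: add the lengths
 * of the words in a NUL-terminated string to hist[0..maxlen-1].
 * hist[n-1] counts words of length n; longer words share the last slot. */
void count_word_lengths(const char *text, int hist[], int maxlen)
{
    int nc = 0;    /* length of the word in progress; 0 when outside a word */

    for (size_t i = 0; ; i++)
    {
        unsigned char ch = (unsigned char)text[i];

        if (ch == '\0' || isspace(ch))
        {
            if (nc > 0)             /* a word has just ended */
            {
                if (nc > maxlen)
                    nc = maxlen;    /* all long words grouped together */
                hist[nc - 1]++;
            }
            nc = 0;
            if (ch == '\0')
                break;              /* end of string ends the last word too */
        }
        else
            ++nc;
    }
}
```

Because the end of the string is treated like white space, "abc" and "abc " produce the same histogram, which is the behaviour the revised program aims for at EOF.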

I use enum instead of #define whenever possible so that the symbols are available in the debugger. You carefully avoided one common mistake: you made the variable c an int, not a char.
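To illustrate why c should be an int, here is a small sketch with two hypothetical helper functions (the names are mine, not from any library). The first tests the value as getchar() returns it; the second narrows it to char first, which is the mistake being avoided. What the second one does with byte 0xFF depends on whether plain char is signed on your platform, so treat the comments as describing common behaviour rather than a guarantee.

```c
#include <stdio.h>

/* Hypothetical helpers for illustration only. */

/* Correct test: c holds getchar()'s int result, so the byte value
 * 0xFF (255) stays distinct from EOF, which is negative. */
int is_eof_as_int(int c)
{
    return c == EOF;
}

/* Flawed test: the value is narrowed to char before the comparison.
 * Where plain char is signed and EOF is -1 (common, not guaranteed),
 * byte 0xFF becomes -1 and is mistaken for EOF. Where plain char is
 * unsigned, a real EOF never compares equal, so a loop would not stop. */
int is_eof_as_char(int c)
{
    char narrowed = (char)c;
    return narrowed == EOF;
}
```

With int, every byte value and EOF remain distinguishable, which is why the loop condition (c = getchar()) != EOF is reliable.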

#include <stdio.h>

enum { IN =  1, OUT = 0 };  /* inside, outside a word */
enum { MAXLEN = 50 };

/* print a histogram of the lengths of the words in the input */
int main(void)
{
    int c, i, j, nc, state;
    int wordlength[MAXLEN];
    state = OUT;
    nc = 0;

    for (i = 0; i < MAXLEN; i++)
        wordlength[i] = 0;

    while ((c = getchar()) != EOF) 
    {
        if (c == ' ' || c == '\n' || c == '\t') 
        {
            if (state == IN) 
            {
                if (nc > MAXLEN)
                    nc = MAXLEN;    /* All long words grouped together */
                wordlength[nc-1]++;
            }
            state = OUT;
        }
        else if (state == OUT) 
        {
            state = IN;
            nc = 0;
        }
        if (state == IN)
            ++nc;
    }

    if (state == IN && nc > 0)  /* input ended mid-word; nc is not reset after a word */
    {
        if (nc > MAXLEN)
            nc = MAXLEN;    /* All long words grouped together */
        wordlength[nc-1]++;
    }

    for (j = 0; j < MAXLEN; j++)
        printf("\n%d - %d", j, wordlength[j]);

    for (i = 10; i >= 0; i--) 
    {
        for (j = 0; j < MAXLEN; j++)
            putchar( (wordlength[j] > i) ? '|' : ' ');
        printf("\n");
    }
    return 0;
}

You said you were having problems with your machine. I'd be very cautious about claiming to find a bug in the system, especially in such an obvious call as getchar(). I can't rule out the possibility, but that would be the last thing I'd think of blaming. I'd spend a lot of time working out what I've done wrong to break things before thinking there's a bug in getchar().


In the comments, you ask to be told why your program is not working in your environment. Since you've not (yet) formally identified the platform/environment where you are running your program, this is not possible.

However, I have demonstrated that your original as-posted program works reasonably sanely in a Unix-like environment (I'm testing on MacOS X 10.7.2, but it would work the same on any similar Unix-like system). The revised version works slightly better; it will count the last word entered even if it is not followed by a space or newline.

If, as inferred, you are working on Windows, then the terminal I/O model may be different. In particular, the C standard leaves it implementation-defined whether the last line of a text stream (perhaps including terminal input) requires a terminating newline; on a system that requires one, any characters after the last newline may be discarded. The behaviour for binary files is different. If the data after the last newline is being discarded, that would be consistent with the behaviour you are reporting. It may well be the expected behaviour if you look at the documentation for your unidentified system. This is one of the areas of differences between implementations identified by P.J. Plauger in his excellent (but somewhat dated) 'The Standard C Library'.

However, if what I'm hypothesizing is correct, then I still wish to make it clear that your code is correct (enough); the trouble is simply that your expectations don't match the documented behaviour of your system. Note that reporting the platform on which you are working is sometimes crucial, and it tends to be more crucial when you are encroaching on edge cases. It is still extremely unlikely that you've hit a bug in getchar().

Incidentally, when I was testing, I needed to type Control-D twice (and that was what I was expecting to have to do). The first time flushed the characters that I'd entered on the line (abc) to the program as a 3-byte read; the second also flushed the characters that I'd entered (all zero of them) to the program as a 0-byte read, which was then interpreted as EOF by getchar(). I also tested with abc followed by a blank, and then the EOF. Your code did not count the abc without a blank; it did count the abc when it was followed by a blank.
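The "0-byte read means EOF" rule can be seen directly with the POSIX read() call, which is what stdio typically sits on top of on Unix-like systems. The sketch below is an assumption-laden illustration: drain_fd is a hypothetical helper name, the code is POSIX-only (it will not build as-is with a Windows C runtime), and it ignores EINTR for brevity.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until read() returns 0 (end of input)
 * or -1 (error). Stores the number of non-empty reads in *nreads and
 * returns the total number of bytes read, or -1 on error. */
ssize_t drain_fd(int fd, int *nreads)
{
    char buf[256];
    ssize_t total = 0;
    ssize_t n;

    *nreads = 0;
    while ((n = read(fd, buf, sizeof buf)) > 0)
    {
        total += n;     /* e.g. 3 bytes for "abc" */
        ++*nreads;
    }
    /* n == 0 here is what getchar() would report as EOF */
    return (n < 0) ? -1 : total;
}
```

Called on a terminal in its default (canonical) mode, typing abc and then Control-D would complete one read of 3 bytes; a second Control-D on the now-empty line would complete a read of 0 bytes and end the loop.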
