
K&R: Chapter 6 - Why does the getword() function not read EOF?

This is my very first post on Stack Overflow, so I hope I don't step on anyone's toes.

Of course, all input is welcome and appreciated, but those best suited to answer will have actually read the book, The C Programming Language, 2nd ed.

I have just finished coding Exercise 6-4, but I cannot seem to figure something out. Why does the getword() function not read EOF until I press Ctrl+D (I code in C in an Arch Linux VM)?

Many of my previous exercises from the book require reading from stdin. One way I would do it is via something like

while ((c = getchar()) != EOF) {...}

In such an instance, I never have to press Ctrl+D. I type my input, press Enter, the stdin buffer gets flushed out, and EOF is detected automatically. The getword() function also relies on getchar() at its base, so why does it hang my program?

The getword() function is called from main():

while (getword(word, MAX_WORD) != EOF) {
    if (isalpha(word[0])) {
        root = addtree(root, word);
    }
}

The getword() function itself:

int getword(char *word, int lim) {

    char *w = word;
    int c;

    while (isspace(c = getch())) {
    }
    if (c != EOF) {
        *w++ = c;
    }
    // This point is reached
    if (!isalpha(c)) {
        // This point is never reached before Ctrl+D
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++) {
        if (!isalnum(*w = getch())) {
            ungetch(*w);
            break;
        }
    }
    *w = '\0';
    return word[0];
}

I put comments to indicate the point where I determined that EOF is not being read.

The getch() and ungetch() functions are the same ones used in the Polish notation calculator from Chapter 4 (and that program was able to read EOF automatically - by pressing Enter):

#define BUF_SIZE 100

char buf[BUF_SIZE];
int bufp = 0;

int getch(void) {

    return (bufp > 0) ? buf[--bufp] : getchar();
}

void ungetch(int c) {

    if (bufp >= BUF_SIZE) {
        printf("ungetch: too many characters\n");
    }
    else {
        buf[bufp++] = c;
    }
}

Thus far, this is the first program I wrote since the beginning of this book that requires me to manually enter the EOF via Ctrl+D. I just can't seem to figure out why.

Much appreciation in advance for explanations...

Having to type Ctrl+D to get EOF is the normal behavior for Unix-like systems.

For your code snippet:

while ((c = getchar()) != EOF) {...}

pressing Enter definitely shouldn't terminate the loop (unless your tty settings are badly messed up).

Try compiling and running this program:

#include <stdio.h>
int main( void )
{
    int c;
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}

It should print everything you type, and it should terminate only when you type control-D at the beginning of a line (or when you kill it with control-C).

The 'never reached' point is reached only when the first non-space character is not a letter (for example a punctuation mark or a digit) or when getch() returns EOF. If you type a letter, that branch is bypassed, and whitespace never gets that far because the first loop skips it.

When input is coming from a terminal (standard input), EOF is not detected until you type Control-D (or whatever the eof character is in the stty -a output) at the start of a line, that is, right after a newline, or until you type it twice in a row in the middle of a line. The code reads straight through newlines because the newline character '\n' satisfies isspace().
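To make that last point concrete, here is a minimal sketch of the whitespace-skipping step on its own. The helper name first_nonspace() and the FILE * parameter are my own assumptions for illustration (the question's code reads through getch() instead); the point is only that '\n' is consumed like any other space, while EOF is a separate value that getc() returns once the stream has really ended.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper, not from K&R: skip whitespace on `in` and return
   the first non-space character, or EOF if the stream ends first. */
int first_nonspace(FILE *in)
{
    int c;  /* int, not char, so that EOF stays distinguishable */

    /* '\n' satisfies isspace(), so pressing Enter never ends this loop. */
    while ((c = getc(in)) != EOF && isspace(c))
        ;
    return c;
}
```

Called as first_nonspace(stdin) on a terminal, this keeps waiting across any number of blank lines until a non-space character or an end-of-file condition arrives.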

The source of my confusion, compared with my previous programs, was that their output was always printed to stdout inside the while loop, so I saw the result immediately without needing to send EOF. In this one, the tree is not printed until after the while loop ends, so the program has to see EOF first. I failed to recognize that, and that's why I was going insane.
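For anyone else who trips over this, a small sketch of the "result only after the loop" pattern, assuming a hypothetical count_words() helper that stands in for the tree-building program: nothing can be reported until getc() has returned EOF, which on a terminal means Ctrl+D.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper, not from the book: count whitespace-separated
   words on `in`. The total is known only after EOF has been read. */
long count_words(FILE *in)
{
    long words = 0;
    int in_word = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        if (isspace(c)) {
            in_word = 0;        /* a newline just ends the current word */
        } else if (!in_word) {
            in_word = 1;        /* first character of a new word */
            words++;
        }
    }
    return words;               /* reached only once input has ended */
}
```

A program that called putchar(c) inside the loop would show output after every Enter, which is what made the earlier exercises look as if Enter were ending the input.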

Thanks again for setting me straight!
