
Read from stdin and fill buffer until EOF

I need to read from stdin into a buffer of _SC_PAGESIZE bytes (from sysconf()) until stdin reaches EOF. The program is supposed to be a wc clone, so I expect input such as the contents of a regular file. If the buffer isn't big enough to hold all of stdin, I have to fill it, process it, clear it, and then continue filling it from the current file offset in stdin. I'm having trouble detecting EOF on stdin, and I'm getting an infinite loop. Here's what I have:

int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);
if (argc < 2) {
        int fd;
        while (!feof(stdin)) {
                fd = read(0, buf, pSize);
                if (fd == -1)
                        err_sys("Error reading from file\n");
                lseek(0, pSize, SEEK_CUR);
                if (fd == -1)
                        err_sys("Error reading from file\n");
                processBuffer(buf);
                buf = calloc(pSize, sizeof(char));
        }
        close(fd);
}

I'm assuming the problem is the test condition, while (!feof(stdin)), so I guess what I need is a correct test condition to exit the loop.

You can write the loop like

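ssize_t n;
/* read() advances the file offset itself, so no lseek() is needed */
while ((n = read(0, buf, pSize)) > 0) {
    /* process the n bytes in buf[0..n-1] */
}
if (n == -1)
    err_sys("Error reading from file\n");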

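feof(stdin) only reports the end-of-file flag of the stdio stream, and read() bypasses stdio, so that flag is never set and your loop never ends. Remember too that EOF is just one exit condition, and an error may occur before EOF is ever reached. The real test for continuing the loop is the return value of read: a positive count means data arrived, 0 means end of file, and -1 means an error. Whether testing n > 0 alone is enough depends on what you are reading from. For stdin it is usually enough. For a non-blocking socket, for example, read can return -1 with errno set to EAGAIN when no data is ready yet, so the condition could be written as while (n > 0 || (n == -1 && errno == EAGAIN)).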
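To make the return-value handling concrete, here is a minimal sketch of a read loop that distinguishes data, end of file, and errors. The helper name drain_fd and its parameters are hypothetical, not part of the original program, and retrying on EINTR is an extra assumption about how you want to treat signal interruptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read fd to end of file in chunks of at most
 * cap bytes. Returns the total number of bytes read, or -1 on error. */
long drain_fd(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);
    long total = 0;

    for (;;) {
        ssize_t n = read(fd, buf, cap);
        if (n > 0) {
            total += n;         /* only buf[0..n-1] holds valid data */
            continue;
        }
        if (n == 0)             /* 0 means end of file */
            return total;
        if (errno == EINTR)     /* interrupted by a signal: retry */
            continue;
        return -1;              /* genuine error; errno says why */
    }
}
```

In the wc clone, the line that adds to total is where the chunk would be handed to processBuffer, along with n so it knows how much of the buffer is valid.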

Why are you using the low-level read instead of a FILE * stream with fgets (or POSIX getline)? Further, you leak memory every time your loop executes:

            buf = calloc(pSize, sizeof(char));

because the assignment overwrites the address held in buf. The only reference to the previous block of memory is lost, so it can never be freed.

Instead, allocate your buffer once, then repeatedly fill it and pass the filled buffer to processBuffer. You can even use a ternary operator to choose between opening a file and reading from stdin, e.g.:

int pSize = sysconf(_SC_PAGESIZE);
char *buf = calloc(pSize, sizeof(char));
assert(buf);

FILE *fp = argc > 1 ? fopen (argv[1], "r") : stdin;
if (!fp) {
    perror ("fopen failed");
    return 1;
}

while (fgets (buf, pSize, fp))
    processBuffer(buf);     /* do not call calloc again -- memory leak */

if (fp != stdin) fclose (fp);   /* close file if not stdin */

(Note: since fgets reads one line at a time, you can simply count the loop iterations to obtain your line count, provided no line, including its newline, is longer than pSize - 1 characters.)
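If lines may be longer than the buffer, counting iterations overcounts, because fgets returns a long line in several pieces. One way around that, sketched below, is to count only the pieces that end in a newline. The function name count_lines is hypothetical, and the sketch assumes text input without embedded nul bytes, since it relies on strlen.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count newline-terminated lines in fp using a
 * caller-supplied buffer of cap bytes. Assumes no embedded nul bytes. */
long count_lines(FILE *fp, char *buf, int cap)
{
    assert(fp != NULL && buf != NULL && cap > 1);
    long lines = 0;

    while (fgets(buf, cap, fp)) {
        size_t len = strlen(buf);
        /* a piece that ends in '\n' completes a line; anything else is
         * part of a long line or an unterminated last line */
        if (len > 0 && buf[len - 1] == '\n')
            lines++;
    }
    return lines;
}
```

Like wc -l, this counts newline characters, so a final line without a trailing newline is not counted.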

If you want to read in pSize-sized chunks, you can use fread instead of fgets. The main effect is fewer calls to processBuffer, so it is entirely up to you. Replace the while (...) loop with:

size_t n;
while ((n = fread (buf, 1, (size_t)pSize, fp)) > 0)
    processBuffer(buf);     /* only the first n bytes of buf are valid */

if (ferror(fp))     /* test ferror to ensure the loop exited on EOF */
    perror ("fread ended in error");

(Note: with an item size of 1 and a count of pSize, fread returns the number of bytes actually read, so a short final chunk is still processed. Requesting a single item of pSize bytes and testing for == 1 would silently skip it. Like read, fread does not nul-terminate buf, so processBuffer needs to know n, for example through a second parameter, and must not pass buf to a function expecting a string or scan buf for a terminating nul character.)
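As an illustration of processing raw chunks by length rather than as strings, here is a rough sketch of wc-style counting over fread chunks. The names struct counts and count_stream are hypothetical, and treating any isspace character as a word separator is an assumption about how words are defined.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical result type for the sketch */
struct counts { long lines, words, bytes; };

/* Hypothetical helper: count lines, words and bytes in fp, reading
 * chunks of at most cap bytes. Returns 0 on EOF, -1 on a read error. */
int count_stream(FILE *fp, char *buf, size_t cap, struct counts *c)
{
    assert(fp != NULL && buf != NULL && cap > 0 && c != NULL);
    int in_word = 0;            /* carried across chunk boundaries */
    size_t n;

    c->lines = c->words = c->bytes = 0;
    while ((n = fread(buf, 1, cap, fp)) > 0) {
        c->bytes += (long)n;
        for (size_t i = 0; i < n; i++) {    /* never look past n */
            unsigned char ch = (unsigned char)buf[i];
            if (ch == '\n')
                c->lines++;
            if (isspace(ch))
                in_word = 0;
            else if (!in_word) {
                in_word = 1;                /* first byte of a new word */
                c->words++;
            }
        }
    }
    return ferror(fp) ? -1 : 0;
}
```

Keeping in_word outside the loop is what lets a word that straddles two chunks be counted once.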

Look things over and let me know if you have further questions.


 