
fseek only working with fread call after rather than read?

I open a file:

FILE *fp = fopen("hello_world.txt", "rb");

which just has the contents Hello World!

Then I get the size and reset to the beginning:

fseek(fp, 0L, SEEK_END);
size_t sz = ftell(fp);
fseek(fp, 0L, SEEK_SET);

When I go to perform a read, it does not seem to work: read(fileno(fp), buffer, 100) returns 0.

However, if I instead do:

fread(buffer, 100, 1, fp)

This does indeed read into the buffer correctly.

Even stranger, if I change the offset for the first fseek call to 1, it works completely fine (despite being past the end of the file). I'm wondering why this is happening. My initial thought was that it has to do with clearing the EOF flag, but I thought that should at least be reset when doing fseek back to the start. I'm not sure why fread works, though. It looks like I'm invoking some sort of undefined behavior, since some things vary when running on different machines, but I have no idea why.

Here's an MCVE:

#include <stdio.h>
#include <unistd.h>

int main() {
     FILE *fp = fopen("hello_world.txt", "rb");
     fseek(fp, 0L, SEEK_END); // works fine if offset is 1, but read doesn't get any bytes if offset is 0
     size_t sz = ftell(fp);
     fseek(fp, 0L, SEEK_SET);
     char buffer[100];
     size_t chars_read = read(fileno(fp), buffer, 100);
     printf("Buffer: %s, chars: %lu", buffer, chars_read);
     fclose(fp);
     return 0;
 }

The problem is subtle, but it boils down to:

Do not mix stream level input/output and positioning calls with low level system calls on the underlying system handle.

Here is a potential explanation of the actual problem:

  • fseek(fp, 0L, SEEK_END); uses a system call lseek(fileno(fp), 0L, 2); to determine the length of the file associated with the system handle. The length returned by the system is 12, which is smaller than the stream buffer size, so fseek() resets the system handle position, reads the 12 bytes into the buffer (leaving the system handle position at 12), and sets the stream's internal file position to 12.
  • ftell(fp); returns the stream's internal file position, 12. It does so because the stream is opened in binary mode (which is not recommended for text files because end of line sequences will not be translated into newline characters '\n' on legacy systems).
  • fseek(fp, 0L, SEEK_SET); sets the stream's internal file position to 0, which is inside the currently buffered contents, so it does not issue an lseek() system call.
  • read(fileno(fp), buffer, 100); cannot read anything because the current position for the system handle is at 12, the end of file.
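  • fread(buffer, 100, 1, fp) would take the 12 bytes of file contents from the stream buffer, try to read more from the file, and find none available. Note that fread returns the number of complete elements read: with an element size of 100 and a count of 1, that is 0 here, even though the 12 bytes end up in buffer in practice. Calling fread(buffer, 1, 100, fp) instead would return the number of characters read, 12.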

Conversely, here is what happens if you pass 1 to fseek():

  • fseek(fp, 1L, SEEK_END); uses a system call lseek(fileno(fp), 0L, 2); to determine the length of the file associated with the system handle. The length returned by the system is 12, hence the requested position is 13, still smaller than the stream buffer size. fseek() resets the system handle position and tries to read 13 bytes from the file into the stream buffer, but only 12 bytes are available. fseek() then clears the buffer, issues a system call lseek(fileno(fp), 1L, 2);, and keeps track of the stream's internal file position as 13.
  • ftell(fp); returns the stream's internal file position, which is 13.
  • fseek(fp, 0L, SEEK_SET); resets the internal file position to 0 and issues a system call lseek(fileno(fp), 0L, 0); because the position was outside the current stream buffer.
  • read(fileno(fp), buffer, 100); reads the file contents from the system handle's current position, which is also 0, hence it behaves as expected.

Notes:

  • This behavior is not guaranteed as the C Standard does not specify the implementation of the stream functions, but it is consistent with the observed behavior.
  • You should check the return values of fseek() and ftell() for failure.
  • Also use %zu for size_t arguments.
  • buffer is not necessarily null terminated: do not use %s to print its contents with printf; use %.*s and pass (int)chars_read as the precision value.

Here is an instrumented version:

#include <stdio.h>
#include <unistd.h>

#ifndef fileno
extern int fileno(FILE *fp); // in case fileno is not declared
#endif

int main() {
    FILE *fp = fopen("hello_world.txt", "rb");
    if (fp) {
        fseek(fp, 0L, SEEK_END);
        long sz = ftell(fp);
        fseek(fp, 0L, SEEK_SET);
        char buffer[100];
        ssize_t chars_read = read(fileno(fp), buffer, 100);
        printf("\nread(fileno(fp), buffer, 100) = %zd, Buffer: '%.*s', sz = %ld\n",
               chars_read, (int)chars_read, buffer, sz);  // sz is a long, so %ld
        fclose(fp);
    }
    fp = fopen("hello_world.txt", "rb");
    if (fp) {
        fseek(fp, 1L, SEEK_END);
        long sz = ftell(fp);
        fseek(fp, 0L, SEEK_SET);
        char buffer[100];
        ssize_t chars_read = read(fileno(fp), buffer, 100);
        printf("\nread(fileno(fp), buffer, 100) = %zd, Buffer: '%.*s', sz = %ld\n",
               chars_read, (int)chars_read, buffer, sz);  // sz is a long, so %ld
        fclose(fp);
    }
    return 0;
}

Here is a trace of the system calls on Linux, consistent with my tentative explanation. The file hello_world.txt contains Hello world! without a newline, 12 bytes total:

chqrlie$ strace ./rb612-1
...
<removed system calls related to program startup>
...
open("hello_world.txt", O_RDONLY)       = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5e356ed000
fstat(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "Hello world!", 12)             = 12
lseek(3, 12, SEEK_SET)                  = 12
read(3, "", 100)                        = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5e356ec000
write(1, "\n", 1
)                       = 1
write(1, "read(fileno(fp), buffer, 100) = "..., 55read(fileno(fp), buffer, 100) = 0, Buffer: '', sz = 12
) = 55
close(3)                                = 0
munmap(0x7f5e356ed000, 4096)            = 0
open("hello_world.txt", O_RDONLY)       = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5e356ed000
fstat(3, {st_mode=S_IFREG|0644, st_size=12, ...}) = 0
lseek(3, 0, SEEK_SET)                   = 0
read(3, "Hello world!", 13)             = 12
lseek(3, 1, SEEK_CUR)                   = 13
lseek(3, 0, SEEK_SET)                   = 0
read(3, "Hello world!", 100)            = 12
write(1, "\n", 1
)                       = 1
write(1, "read(fileno(fp), buffer, 100) = "..., 68read(fileno(fp), buffer, 100) = 12, Buffer: 'Hello world!', sz =
) = 68
close(3)                                = 0
munmap(0x7f5e356ed000, 4096)            = 0
