
Read file after EOF

Is it possible to read a file past its EOF?

I am reading a file that may contain an EOF character before its end, or several EOF characters. The file is a plain .txt file, and I can get the number of characters using fsize, but getc seems to return EOF (or -1) from the first EOF all the way to the end of the file.

int c = 0;
char x;
FILE *file = fopen("MyTextFile.txt", "r");
off_t size = fsize("MyTextFile.txt");

while (c < size) {
    x = getc(file);
    if (x != -1)
        printf("%c ", x);
    else
        printf("\nFOUND EOF!\n");
    c++;
}
fclose(file);

Unfortunately, even though I'm sure the file content continues after the EOF, I cannot read the rest.

SOLVED: Opening the file with "rb" instead of "r" and declaring x as int allowed me to read the whole file, including the multiple EOFs. I'm not sure whether this is a trick or something that is actually allowed, but it works.

Logically, there is no data after EOF (end of file).

Note that EOF is not a character; it's a special value returned by getc() after an end-of-file or error condition has been encountered, a value returned instead of a character value.
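Because EOF is a return value rather than a byte in the file, the usual way to find out why getc() returned it is to ask the stream afterwards with feof() or ferror(). The following is only a minimal sketch of that idea; the names drain_stream and read_status are made up for illustration and do not come from the question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes, for illustration only. */
enum read_status { READ_HIT_EOF, READ_HIT_ERROR, READ_OPEN_FAILED };

/* Reads the whole file and reports why getc() finally returned EOF.
   The number of bytes seen is stored in *bytes_read. */
enum read_status drain_stream(const char *path, long *bytes_read)
{
    assert(path != NULL && bytes_read != NULL);
    *bytes_read = 0;

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return READ_OPEN_FAILED;

    int c;  /* int, so that EOF cannot collide with a byte value */
    while ((c = getc(in)) != EOF)
        ++*bytes_read;

    /* EOF only says "no character was returned"; the stream's
       error indicator tells the two cases apart. */
    enum read_status status = ferror(in) ? READ_HIT_ERROR : READ_HIT_EOF;
    fclose(in);
    return status;
}
```

A caller would check the returned status first and only then trust the byte count.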

You haven't said so in the question, but my guess is that you have a Windows text file with one or more embedded Ctrl-Z (0x1a) characters. That's the only thing I can think of that's consistent with your description.

In Windows, a Ctrl-Z character in a text file is treated as the end of the file. (This goes back to earlier systems where the end of the data was not clearly marked, because the file system only recorded the number of blocks.) Ctrl-Z is not an EOF character; it's a character value that, on Windows, triggers an end-of-file condition and causes getc() to return EOF.

Basically you have a malformed text file, and you should probably just fix it and/or fix whatever generated it. But if you really need to read data from it, I suggest opening it in binary mode rather than text mode. You'll then see each CR/LF end-of-line marker as two characters ('\r', '\n') rather than just '\n', and Ctrl-Z (0x1a) is just another byte value. Since you're not really treating the file as text (the "text" ends at the first Ctrl-Z), it makes sense to read it in binary mode.
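As a rough illustration of the binary-mode approach, the sketch below writes a small sample file with an embedded 0x1a byte and then counts every byte in it. The helper names write_sample and count_bytes and the sample contents are assumptions made up for this example, not part of the original question.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes 7 bytes, one of which is Ctrl-Z (0x1a).
   Returns 0 on success, -1 on failure. */
int write_sample(const char *path)
{
    static const unsigned char data[] = { 'a', 'b', 0x1a, 'c', 'd', '\r', '\n' };
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;
    size_t written = fwrite(data, 1, sizeof data, out);
    int close_failed = (fclose(out) != 0);
    return (written != sizeof data || close_failed) ? -1 : 0;
}

/* Counts all bytes in the file and how many of them are 0x1a.
   Returns the total, or -1 if the file cannot be opened or read. */
long count_bytes(const char *path, long *ctrl_z_count)
{
    assert(path != NULL && ctrl_z_count != NULL);
    *ctrl_z_count = 0;

    FILE *in = fopen(path, "rb");   /* binary mode: 0x1a is ordinary data */
    if (in == NULL)
        return -1;

    long total = 0;
    int c;                          /* int, compared against EOF, not -1 */
    while ((c = getc(in)) != EOF) {
        if (c == 0x1a)
            ++*ctrl_z_count;
        ++total;
    }

    int failed = ferror(in);
    fclose(in);
    return failed ? -1 : total;
}
```

Opened with "rb", the loop should see all seven bytes, including the '\r' and the 0x1a; with "r" on Windows it would be expected to stop at the Ctrl-Z.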

There are probably tricks you can play to read past the Ctrl-Z in text mode; for example clearerr() is likely to work. But doing that goes beyond what the C standard guarantees -- which may or may not be a problem for you.

Also, you should definitely use the symbol EOF, not the "magic number" -1. It's not even guaranteed that EOF == -1, and using the symbol EOF will make your code much clearer.

Finally, thanks to Mark Plotnick for pointing out in a comment something I should have noticed myself. getc() returns an int result; you're assigning it to a char object. x needs to be of type int, not char. This is necessary so you can distinguish between the value of EOF and the value of any actual character.
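To see why the int matters, consider a file containing the single byte 0xFF. The sketch below (read_ff_byte is a hypothetical name used only here) stores the result of getc() in an int, where the byte arrives as 255 and stays distinct from the negative EOF. Stored in a char, that same byte could compare equal to EOF on platforms where char is signed.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: creates a one-byte file holding 0xFF and returns
   what getc() yields for that byte, or EOF on any failure. */
int read_ff_byte(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return EOF;
    int put_result = putc(0xFF, f);
    if (fclose(f) != 0 || put_result == EOF)
        return EOF;

    f = fopen(path, "rb");
    if (f == NULL)
        return EOF;
    int c = getc(f);   /* the byte is returned as an unsigned char value: 255 */
    fclose(f);
    return c;
}
```

Since EOF is required to be negative, a return value of 255 can never be mistaken for it as long as it is kept in an int.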

Your code is incomplete so it's hard to say what the problem is, but I would suggest:

  1. Make sure you are opening the file in binary mode "rb"
  2. Make sure x is of type int

Chapter and verse:

7.21 Input/output <stdio.h>

7.21.1 Introduction
...
3 The macros are...

EOF

which expands to an integer constant expression, with type int and a negative value, that is returned by several functions to indicate end-of-file, that is, no more input from a stream;

EOF isn't a character in the file itself; it's a value returned by the input function to indicate that there is no more input available on the stream; you can't read past it, because there's nothing to read.

