
C File Handling and end of the line character in Notepad

I was reading about C file handling and ran into the fseek() function. The book says the following about its use:

int fseek(FILE *stream, long offset, int from );  //The function prototype

"On binary streams, seeks from SEEK_END may not be supported and should therefore be avoided. On text streams, the offset must be zero if from is either SEEK_CUR or SEEK_END. The offset must be a value previously returned from a call to ftell() on the same stream if from is SEEK_SET."

I don't understand this usage. Why must the offset be zero?

To find the answer, I investigated and found that, in text streams, the newline (EOL) character of a C program is mapped to a different character sequence on MS-DOS. The size of the newline character in C is 1 byte.
What happens when it is written to a Notepad file? What is the size of EOL in Notepad?
I created a Notepad file and did the following:

Scenario 1:
abcd

The size shown was 4 bytes. Where is the newline or EOF now?

Scenario 2:
abcd
a

The size shown was 7 bytes.

Scenario 3:
abcd
a
b

The size shown was 10 bytes. How was the size calculated now?

Can anybody answer these questions?

I assume Notepad writes Windows-style line endings, which consist of \r\n. Hence the answer to your question is 2 bytes per end of line, which explains the observed behavior.

Also, EOF is not a character written to the file; it is a special value returned by some functions to indicate the end of the file.
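On the original fseek() question: since one '\n' in the program may be stored as two bytes on disk, a character count is not a reliable byte offset in a text stream. That is consistent with the book's rule that a nonzero offset should come from ftell(). Below is a minimal sketch of that pattern, assuming a text file with at least two short lines; second_line_twice is a hypothetical helper name, not a library function.

```c
#include <stdio.h>
#include <string.h>

/* Reads the second line of a text file, seeks back with a position
   obtained from ftell(), and reads the same line again.
   Returns 0 and stores the line in `line` if both reads match,
   -1 otherwise. Hypothetical helper, for illustration only. */
int second_line_twice(const char *path, char line[128])
{
    char first[128], again[128];
    FILE *fp = fopen(path, "r");              /* text mode */
    if (fp == NULL)
        return -1;

    int ok = 0;
    if (fgets(first, sizeof first, fp) != NULL) {    /* skip line 1 */
        long mark = ftell(fp);                       /* opaque position token */
        if (mark != -1L
            && fgets(line, 128, fp) != NULL
            && fseek(fp, mark, SEEK_SET) == 0        /* offset came from ftell() */
            && fgets(again, sizeof again, fp) != NULL
            && strcmp(line, again) == 0)
            ok = 1;
    }
    fclose(fp);
    return ok ? 0 : -1;
}
```

The value in mark is only passed back to fseek(); the sketch never does arithmetic on it, because on a text stream it need not equal the number of characters read so far.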

Notepad uses the DOS / Windows sequence \r\n to mark end-of-line.

  • The first example has no end-of-line sequence, so 4 printable characters produce a size of 4: abcd
  • The second example has one end-of-line sequence, so 5 printable characters plus 2 for the end-of-line marker make its size 7: abcd\r\na
  • The third example has two end-of-line sequences, so 6 printable characters plus 4 for the two end-of-line markers make its size 10: abcd\r\na\r\nb
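The arithmetic in the three cases above can be sketched as a small function. This is only an illustration, assuming every line break is stored as the two-byte \r\n sequence; crlf_size is a hypothetical name, not a library function.

```c
#include <stddef.h>

/* Returns the on-disk size of `text` when every '\n' in it is stored
   as the two-byte sequence "\r\n". Hypothetical helper. */
size_t crlf_size(const char *text)
{
    size_t bytes = 0;
    for (const char *p = text; *p != '\0'; ++p)
        bytes += (*p == '\n') ? 2 : 1;   /* a line break costs two bytes */
    return bytes;
}
```

With this sketch, crlf_size("abcd") gives 4, crlf_size("abcd\na") gives 7, and crlf_size("abcd\na\nb") gives 10, matching the sizes reported in the question.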

You can write a small C program to read your files character by character and print each character as a number. The code of \r is 13; the code of \n is 10.
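A minimal sketch of such a program's core is shown here. It assumes the file is opened in binary mode ("rb") so that \r\n is not translated into a single '\n' on Windows; dump_bytes is a hypothetical helper name.

```c
#include <stdio.h>

/* Prints the numeric code of every byte in the file and returns the
   number of bytes read, or -1 if the file cannot be opened.
   Hypothetical helper, for illustration only. */
long dump_bytes(const char *path)
{
    /* "rb": binary mode, so "\r\n" stays as two separate bytes */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    long count = 0;
    int c;                          /* int, so EOF is distinguishable */
    while ((c = fgetc(fp)) != EOF) {
        printf("%d ", c);
        ++count;
    }
    printf("\n");
    fclose(fp);
    return count;
}
```

Calling dump_bytes() from main with the path of the second Notepad file should list seven codes, with 13 and 10 appearing between the codes for the two lines.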

Where is the EOF character? Is it not counted in for size calculation?

EOF is not a character; it is a value returned by I/O functions to indicate that the end of input has been reached. It has a special numeric value that differs from the numeric values of all characters. In fact, the reason the functions of the getchar family return int, not char, is to make room for the EOF marker without reserving one of the character codes for it.
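The int return type matters in practice. A minimal sketch, assuming a stream already opened by the caller (count_until_eof is a hypothetical name): the result of fgetc() is kept in an int, so a data byte with value 255 is not confused with EOF. If it were stored in a char, that byte could compare equal to EOF where char is signed, or the loop might never end where char is unsigned.

```c
#include <stdio.h>

/* Counts the characters remaining in `fp` up to end of input.
   Hypothetical helper, for illustration only. */
long count_until_eof(FILE *fp)
{
    long count = 0;
    int c;                          /* must be int, not char */
    while ((c = fgetc(fp)) != EOF)  /* EOF is a negative int, outside 0..255 */
        ++count;
    return count;
}
```

The count never includes EOF itself, which is why it does not contribute to the file sizes shown in the question.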


 