
How to open a file of any length in C?

As a school assignment I'm tasked with writing a program that opens any text file and performs a number of operations on the text. The text must be loaded using a linked list, meaning a chain of structs, each containing a char pointer and a pointer to the next struct. One line per struct.

But I'm having problems actually loading the file. It seems the memory required to hold the text must be allocated before I actually read the text. Hence I have to open the file several times: once to count the number of lines, then twice per line, once to count the characters in the line and once to read them. It seems absurd to open a file hundreds of times just to read it into memory.

Obviously there are better ways of doing this, I just don't know them :-)

Examples

  • Can the point from which fgetc fetches a character be moved without re-opening the file?
  • Can the number of lines or characters in a file be checked before it is "opened"?
  • Can I somehow read a line or string from a file and save it to memory without allocating a fixed amount of bytes?
  • etc

There is no need to open the file more than once, nor to pass through it more than once.

Look at the POSIX getline() function. It reads lines into allocated space. You can use it to read the lines, and then copy the results for your linked list.
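As a rough illustration of that suggestion, here is a minimal sketch that reads a stream once with getline() and copies each line into its own list node. The names line_node, load_lines and free_lines are made up for this example and are not part of any assignment or library; the code also assumes a POSIX system where getline() is available.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for getline() on POSIX systems */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical node type: one line of text per node. */
struct line_node {
    char *text;                  /* heap copy of the line, without the '\n' */
    struct line_node *next;
};

/* Release every node and the text it owns. */
void free_lines(struct line_node *head)
{
    while (head != NULL) {
        struct line_node *next = head->next;
        free(head->text);
        free(head);
        head = next;
    }
}

/* Read every line from fp into a new list in a single pass.
 * Returns the head (NULL for an empty file); *ok is set to 0 only
 * if an allocation failed. */
struct line_node *load_lines(FILE *fp, int *ok)
{
    assert(fp != NULL && ok != NULL);
    struct line_node *head = NULL;
    struct line_node **tail = &head;   /* where the next node gets linked */
    char *buf = NULL;                  /* getline() allocates and grows this */
    size_t cap = 0;
    ssize_t len;

    *ok = 1;
    while ((len = getline(&buf, &cap, fp)) != -1) {
        if (len > 0 && buf[len - 1] == '\n')
            buf[--len] = '\0';         /* drop the trailing newline */

        struct line_node *node = malloc(sizeof *node);
        char *copy = malloc((size_t)len + 1);
        if (node == NULL || copy == NULL) {
            free(node);
            free(copy);
            free_lines(head);
            head = NULL;
            *ok = 0;
            break;
        }
        memcpy(copy, buf, (size_t)len + 1);  /* includes the terminator */
        node->text = copy;
        node->next = NULL;
        *tail = node;                  /* append at the end of the list */
        tail = &node->next;
    }
    free(buf);                         /* getline's buffer is ours to free */
    return head;
}
```

Keeping a pointer to the last "next" field makes each append constant time, so the list never has to be walked while loading. The caller opens the file once with fopen(), passes the stream in, and calls free_lines() when done.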

There is no need with a linked list to know how many lines there are ahead of time; that's an advantage of lists.

So, the code can be done with a single pass. Even if you can't use getline() , you can use fgets() and monitor whether it reads to end of line each time, and if it doesn't you can allocate (and reallocate) space to hold the line as needed ( malloc() , realloc() and eventually free() from <stdlib.h> ).
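The fgets() variant could look something like the sketch below, which uses only standard C. The function name read_line and the starting capacity of 16 bytes are arbitrary choices for illustration. It checks whether fgets() stopped at a newline and, if not, doubles the buffer with realloc() and keeps reading into the free space.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read one line of any length from fp.
 * Returns a malloc'd string without the trailing '\n' (caller frees),
 * or NULL at end of file or if an allocation fails. */
char *read_line(FILE *fp)
{
    assert(fp != NULL);
    size_t cap = 16;            /* small on purpose; any value >= 2 works */
    size_t len = 0;             /* characters stored so far */
    char *line = malloc(cap);
    if (line == NULL)
        return NULL;

    for (;;) {
        size_t room = cap - len;
        if (room > INT_MAX)     /* fgets() takes an int size */
            room = INT_MAX;
        if (fgets(line + len, (int)room, fp) == NULL)
            break;              /* end of file or read error */

        len += strlen(line + len);
        if (len > 0 && line[len - 1] == '\n') {
            line[len - 1] = '\0';   /* complete line: strip the newline */
            return line;
        }
        if (len + 1 == cap) {   /* buffer full, line continues: grow it */
            char *bigger = realloc(line, cap * 2);
            if (bigger == NULL) {
                free(line);
                return NULL;
            }
            line = bigger;
            cap *= 2;
        }
    }

    if (len == 0) {             /* nothing was read: real end of file */
        free(line);
        return NULL;
    }
    return line;                /* last line had no trailing newline */
}
```

Calling read_line() in a loop until it returns NULL visits every line exactly once, and each returned string can be stored directly in a list node without another copy.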

Your specific questions are largely immaterial if you adopt anything of the approach I suggest, but:

  • Using fseek() (and in extremis rewind() ) will move the read pointer (for fgetc() and all other functions), unless the 'file' does not support seeking (e.g., a pipe provided as standard input).

  • The size of a regular file in bytes can be determined with stat() or fstat() or variants. The number of lines cannot be determined except by reading the file.

  • Since the file could be from zero bytes to gigabytes in size, there isn't a sensible way of doing fixed size allocations. You are pretty much forced into dynamic memory allocation with malloc() et al. (Behind the scenes, getline() uses malloc() and realloc() .)
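For completeness, the first two points can be sketched as follows. The helper names file_size_bytes and count_lines_and_rewind are invented for this example; fileno() and fstat() are POSIX rather than ISO C, so this assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L  /* fileno() and fstat() are POSIX */
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Size in bytes of the regular file behind fp, or -1 if it cannot be
 * determined (for example, when fp is a pipe). */
long long file_size_bytes(FILE *fp)
{
    assert(fp != NULL);
    struct stat st;
    if (fstat(fileno(fp), &st) != 0 || !S_ISREG(st.st_mode))
        return -1;
    return (long long)st.st_size;
}

/* Count lines by reading the whole stream, then seek back to the start
 * so the same open stream can be read again without reopening the file.
 * Returns -1 if the stream does not support seeking. */
long count_lines_and_rewind(FILE *fp)
{
    assert(fp != NULL);
    long lines = 0;
    int c;
    int last = '\n';            /* so an empty file counts as zero lines */

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n')
            lines++;
        last = c;
    }
    if (last != '\n')
        lines++;                /* final line without a trailing newline */

    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    return lines;
}
```

Neither helper is needed for the linked-list approach; they only show that repositioning and size queries work on a stream that is already open.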

You cannot count the number of lines in a file without actually traversing it. You could get the total file size, but that's not what's intended here. The idea of using a linked list of lines is that you operate on the file one line at a time. You do not need to read anything in advance. Until you reach the end of the file: read a line, add it in its own node at the end of the linked list, then move on to the next line.

Regarding your first question: you can change the position in the file you are reading from with the fseek() function.

There are several ways you could do this. For example, you could have a fixed-size buffer, fill it with bytes from the file, copy lines from the buffer to the list, fill the buffer again and so on.
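One possible shape for the fixed-size buffer idea is sketched below. Everything named here (CHUNK_SIZE, line_fn, line_stats, tally_line, append_bytes, for_each_line) is hypothetical and chosen only for illustration. The stream is read in fixed chunks with fread(), and a small growable buffer carries a line that straddles two chunks. A callback receives each completed line; a real program would add the line to its list there.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 4096   /* arbitrary fixed buffer size for this sketch */

/* Hypothetical callback type: receives each complete line (no '\n'). */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/* Example callback state and callback: count lines and characters. */
struct line_stats { size_t lines; size_t chars; };

void tally_line(const char *line, size_t len, void *ctx)
{
    struct line_stats *s = ctx;
    (void)line;
    s->lines++;
    s->chars += len;
}

/* Append n bytes to a growable, NUL-terminated buffer. 0 on success. */
static int append_bytes(char **buf, size_t *len, size_t *cap,
                        const char *src, size_t n)
{
    if (*len + n + 1 > *cap) {
        size_t new_cap = *cap ? *cap : 64;
        while (new_cap < *len + n + 1)
            new_cap *= 2;
        char *p = realloc(*buf, new_cap);
        if (p == NULL)
            return -1;
        *buf = p;
        *cap = new_cap;
    }
    memcpy(*buf + *len, src, n);
    *len += n;
    (*buf)[*len] = '\0';
    return 0;
}

/* Read fp chunk by chunk and call fn once per line. 0 on success. */
int for_each_line(FILE *fp, line_fn fn, void *ctx)
{
    assert(fp != NULL && fn != NULL);
    char chunk[CHUNK_SIZE];
    char *line = NULL;          /* holds the line being assembled */
    size_t len = 0, cap = 0, got;

    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0) {
        size_t start = 0;       /* start of the current line in chunk */
        for (size_t i = 0; i < got; i++) {
            if (chunk[i] != '\n')
                continue;
            if (append_bytes(&line, &len, &cap, chunk + start, i - start) != 0) {
                free(line);
                return -1;
            }
            fn(line, len, ctx); /* one complete line */
            len = 0;
            start = i + 1;
        }
        /* Keep the unfinished tail until the next chunk arrives. */
        if (start < got &&
            append_bytes(&line, &len, &cap, chunk + start, got - start) != 0) {
            free(line);
            return -1;
        }
    }
    if (len > 0)
        fn(line, len, ctx);     /* last line without a trailing newline */
    free(line);
    return ferror(fp) ? -1 : 0;
}
```

This does by hand what getline() or an fgets() loop does internally, so it is mostly worthwhile when neither function is allowed or when fewer library calls per line matter.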
