
RAM consumption on opening a file

I have a binary file of ~400MB which I want to convert to CSV format. The output CSV file will be ~1GB (by my calculations).

I read the binary file and store it in an array of structures (needed for other processing too). When the user wants to export it to CSV, I create a file (or open an existing one, depending on the user's choice) with fopen and then write to it with fwrite, line by line. Coming to my question, this link from CPlusPlus.com says:

The returned stream is fully buffered by default if it is known to not refer to an interactive device

My question is: when I open this file, will it be loaded into RAM? That is, when the file eventually reaches ~1GB, will it consume that much RAM, or will it live only on the hard disk?

This code will run on Windows as well as Android.

FILE* stream buffering is a C library feature used to reduce system-call overhead (i.e., to avoid issuing a read() system call for every fgetc(), which would be expensive). The buffer is usually small, e.g., 512 bytes.
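If the default buffer size matters for your workload, you can change it per stream with setvbuf before the first I/O operation. A minimal sketch, assuming a hypothetical output file export.csv and an illustrative 64 KiB buffer:

```c
#include <stdio.h>

int main(void) {
    FILE *out = fopen("export.csv", "w");   /* hypothetical output file */
    if (!out)
        return 1;

    /* Enlarge the stdio buffer to 64 KiB so fwrite()/fprintf() batch
       more data per underlying write() system call. setvbuf must be
       called before the first I/O operation on the stream. */
    static char buf[64 * 1024];
    if (setvbuf(out, buf, _IOFBF, sizeof buf) != 0)
        return 1;

    fprintf(out, "col1,col2\n");
    fclose(out);   /* flushes and releases the buffer */
    return 0;
}
```

Either way, only the buffer (not the whole file) lives in your process's memory; the 64 KiB figure here is just an example, not a recommendation.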

The page cache (and similar mechanisms) is a different beast: it is used to reduce the number of disk operations. The operating system typically uses free memory to cache data previously read from or written to disk, so subsequent operations are served from RAM.

If there is a shortage of free memory, data is evicted from the page cache.

This behavior is specific to the operating system, the file system, and the machine, and it might not matter that much. Read about the page cache.

BTW, you might be interested in SQLite.

From an application writer's point of view, you should care more about the virtual memory and address space of your process than about RAM. Physical RAM is managed by the operating system.

On Linux and Android, if you want to optimize that, you might consider (later) using posix_fadvise(2) and perhaps madvise(2). I'm not sure it is worth the pain in your case (since a gigabyte file is not that much today).
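To illustrate, a minimal Linux/Android sketch of how posix_fadvise could be used to hint at sequential access and release cached pages afterwards; the file name is hypothetical and error handling is kept minimal:

```c
#define _POSIX_C_SOURCE 200112L  /* expose posix_fadvise in glibc */
#include <stdio.h>
#include <fcntl.h>   /* posix_fadvise, POSIX_FADV_* */

int main(void) {
    FILE *in = fopen("data.bin", "rb");   /* hypothetical input file */
    if (!in)
        return 1;

    /* Hint that we will read the whole file sequentially, so the kernel
       can read ahead aggressively (len == 0 means "to end of file"). */
    posix_fadvise(fileno(in), 0, 0, POSIX_FADV_SEQUENTIAL);

    /* ... read and process the file ... */

    /* Once done, hint that the cached pages may be dropped. */
    posix_fadvise(fileno(in), 0, 0, POSIX_FADV_DONTNEED);
    fclose(in);
    return 0;
}
```

These calls are advisory only: the program behaves identically without them, which is why they are worth adding only after measuring.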

I read the binary file and store it in an array of structures (required for other processing too), and when the user wants to export it to CSV

Reading per se doesn't use a lot of memory; as myaut says, the buffer is small. The elephant in the room here is: do you read the whole file and put all the data into structures, or do you start processing after one or a few reads, fetching just the minimum amount of data needed? The former will indeed use ~400MB or more of memory; the latter will use quite a lot less, as sketched below. That said, it all depends on how much data is needed to start processing, and maybe you really do need all the data loaded at once.
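For the "process as you read" approach, here is a minimal streaming sketch, assuming a hypothetical record layout (struct record) and illustrative file names; the real structure depends on your binary format:

```c
#include <stdio.h>

/* Hypothetical on-disk record layout; substitute the real structure
   of the binary file being converted. */
struct record {
    int    id;
    double value;
};

/* Convert record by record: peak memory stays at one struct plus the
   stdio buffers, regardless of how large the input file is. */
int convert(const char *bin_path, const char *csv_path) {
    FILE *in  = fopen(bin_path, "rb");
    FILE *out = fopen(csv_path, "w");
    if (!in || !out) {
        if (in)  fclose(in);
        if (out) fclose(out);
        return -1;
    }

    struct record r;
    while (fread(&r, sizeof r, 1, in) == 1)
        fprintf(out, "%d,%f\n", r.id, r.value);

    fclose(in);
    fclose(out);
    return 0;
}
```

Since you keep the array of structures around for other processing anyway, a middle ground is to load the array once and then stream only the CSV export from it, so the export itself adds no significant memory on top.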
