简体   繁体   中英

Best way to read/write file in multithreaded environment (C++)

i have a multithreaded program that reads and writes files. One thread receives data and writes them in a file. Every 250 Mb of data, a new file is created. Multiple other threads can read into these files to retrieve data. I'm using C++ std file stream.

To prevent problems, my current implementation uses two file descriptors for the same file: one for readers and one for the writer. A mutex protects from multiple access at the same time, and the file descriptor position is moved each time the mutex owner needs it.

I really need to be able to read in the file as fast as possible, and the mutex doesn't really help me.

Firstly, I would like to know if it's safe to read and write the file or have multiple reads at the same time (on every platform). Secondly, if yes, I would like to know how it is safe for the hardware like the "Disk read-and-write head" for a HDD. The software works on the disk all the time to save data, and i don't want my algorithm to decrease too much the hard disk life time (already short).

Thank you for your help

There is no problem regarding multiple threads reading the same file.

Now, if I understood your description correctly, you do not modify already-written data, you just continuously append data to your file until it reaches 250Mb, then you continue writing on a new file.

If this is the case, you may not need a mutex at all. For instance, you might be able to keep your whole "file" into memory until it reaches 250mb, and only then you would write it all to disk, so you know that any files already on disk aren't going to be written anymore and can be read freely with no worries. As for the file that is still being written, you can have a global integer that holds how many bytes (or strings or whatever you use) have already been written, and reading-threads are limited by this integer, which does not need a lock, as long as you only update the integer after you have already written the data. (since you said there is only 1 thread writing data).

Simply reading the integer cannot corrupt it even when being done by multiple threads at the same time and being written by a single one, so this will ensure your reader threads will not read beyond the limit, and such limit will always be safe and consistent, while the writer-thread can peacefully write data in an area that is guaranteed to not be bothered by read-threads until it is finished.

As for your second question, if you are indeed able to keep the currently-being-written file fully in memory, that will already save up some HDD usage, as well as time. Additionally, keep in mind most modern HDDs have 32Mb+ of cache, so it is not like every read and write will be directly hitting the HDD itself, unless you have a ton of threads reading random files and random parts of them all the time. If that is the case, there is probably not much you can do to help the HDD. And if that's not the case, there is not much to worry about, as the OS and the caches will do what they were meant to do :)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM