
C++: IO performance issue

I have a large array in memory. I am writing it to a file using:

             FILE* fp = fopen("filename", "wb");
             fwrite(array, sizeof(uint32_t), 1500000000 , fp); // array saved
             fflush(fp) ;
             fclose(fp);

and reading it again using:

              FILE* fp = fopen("filename", "rb");
              fread(array, sizeof(uint32_t), 1500000000, fp);
              fclose(fp);

Writing takes 7 seconds and reading takes 5 seconds.

Actually, I do not have to write the whole array. I have to write and read it element by element, checking some conditions. For example:

#include <iostream>
#include <stdint.h>
#include <cstdio>
#include <cstdlib>
#include <sstream>

using namespace std;

int main()
{
    uint32_t* ele = new uint32_t[100];
    for (int i = 0; i < 100; i++)
        ele[i] = i;

    for (int i = 0; i < 100; i++) {
        if (ele[i] < 20)
            continue;
        // write ele[i] to file
    }

    for (int i = 0; i < 100; i++) {
        if (ele[i] < 20)
            continue;
        // read number from file
        // ele[i] = number * 10;
    }

    std::cin.get();
    delete[] ele;
}

So this is what I am doing.

Writing using:

for (int i = 0; i < 1500000000; i++) {
    if (arrays[i] < 10000000)
        continue;
    uint32_t number = arrays[i];
    fwrite(&number, sizeof(uint32_t), 1, fp1);
}

And reading, in a similar loop, using: fread(&number, sizeof(uint32_t), 1, fp1);

In this case, writing takes 2.13 minutes and reading takes 1.05 minutes.

This is quite a long time for me. Can anybody explain why this is happening (even though in the second case the file is smaller than in the first)? How can I solve this issue? Is there a better approach?

I benchmarked this a little while ago, and on my box lots of small fwrite() calls can only sustain about 90 MB/s (the disk is much faster than that, so the test was not disk-bound).

My suggestion would be to do your own buffering: write the values into an intermediate array, and from time to time write out the entire array using a single fwrite().
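A minimal sketch of that idea, assuming the same arrays pointer, threshold, and uint32_t elements as in the question (the function name write_filtered and the buffer size kBufCap are my own choices):

#include <cstdio>
#include <stdint.h>
#include <vector>

void write_filtered(const uint32_t* arrays, size_t n, FILE* fp)
{
    const size_t kBufCap = 1 << 20;            // ~1M elements (4 MB) per flush
    std::vector<uint32_t> buf;
    buf.reserve(kBufCap);

    for (size_t i = 0; i < n; ++i) {
        if (arrays[i] < 10000000)
            continue;                          // same skip condition as the question
        buf.push_back(arrays[i]);
        if (buf.size() == kBufCap) {           // buffer full: one big write
            fwrite(buf.data(), sizeof(uint32_t), buf.size(), fp);
            buf.clear();
        }
    }
    if (!buf.empty())                          // flush whatever is left over
        fwrite(buf.data(), sizeof(uint32_t), buf.size(), fp);
}

Each fwrite() now moves megabytes at a time instead of four bytes, so the per-call overhead is amortized over many elements.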

Writing just once will be way faster. I would suggest you construct an auxiliary array with just the elements you want to write, and then write this array in a single fwrite() call. Of course this will take additional memory, but that's the standard tradeoff: memory for performance.
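A sketch of that approach under the same assumptions (write_all_at_once is an illustrative name, not from the post); unlike the periodic-flush version above, everything is collected first and written with one call:

#include <cstdio>
#include <stdint.h>
#include <vector>

void write_all_at_once(const uint32_t* arrays, size_t n, FILE* fp)
{
    std::vector<uint32_t> out;                 // the auxiliary array
    for (size_t i = 0; i < n; ++i)
        if (arrays[i] >= 10000000)             // keep only the elements to save
            out.push_back(arrays[i]);
    fwrite(out.data(), sizeof(uint32_t), out.size(), fp);  // one single call
}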

Even though C's FILE* routines are buffered, there's still a fair amount of overhead to each call; making millions of integer-sized reads/writes will kill your performance.

EDIT: are you doing integer-sized reads as an attempt at speed optimization? Or are you doing it for data consistency reasons (i.e., an integer in the array must only be updated if a condition is true)?

If it's for consistency reasons, consider reading a chunk (probably 4k or larger) at a time, then do the compare-and-possibly-update from the chunk of data, or use memory-mapped files if they're available on your target platform(s).
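Here is a rough sketch of the chunked-read idea, assuming the file was written by the filtered loop from the question (so it holds only the values for positions where arrays[i] >= 10000000, in order); the chunk size and the name read_filtered are my own:

#include <cstdio>
#include <stdint.h>

void read_filtered(uint32_t* arrays, size_t n, FILE* fp)
{
    const size_t kChunk = 4096;                // elements per fread (16 KB)
    uint32_t buf[kChunk];
    size_t got = 0, j = 0;

    for (size_t i = 0; i < n; ++i) {
        if (arrays[i] < 10000000)
            continue;                          // this slot was never written
        if (j == got) {                        // chunk used up: refill it
            got = fread(buf, sizeof(uint32_t), kChunk, fp);
            if (got == 0)
                break;                         // end of file
            j = 0;
        }
        arrays[i] = buf[j++] * 10;             // the "number * 10" update
    }
}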

The title of the question says C++, so why not use the excellent buffered stream facilities? Does C++ ofstream file writing use a buffer?
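For completeness, a minimal sketch using std::ofstream (names are illustrative); ofstream's write() goes through the stream's internal buffer, so a single large write behaves much like the fwrite() version:

#include <fstream>
#include <stdint.h>
#include <vector>

void write_with_ofstream(const std::vector<uint32_t>& values, const char* path)
{
    std::ofstream out(path, std::ios::binary); // buffered stream
    out.write(reinterpret_cast<const char*>(values.data()),
              values.size() * sizeof(uint32_t));
}   // destructor flushes and closes the file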
