
Why does reading take so much time?

I have this simple C++ code for loading images into buffer.

FILE *f = fopen("myimage.jpg", "rb");
fseek(f, 0, SEEK_END);
unsigned long fsize = ftell(f);
rewind(f);
unsigned char *buf = new unsigned char[fsize];
fread(buf, 1, fsize, f);
fclose(f);

I use ~1 MB files from my HDD for testing, and the execution time is usually between 100 and 200 ms. That seems like too much to me. I benchmarked my HDD and the read speed was about 80 MB/s. That would mean reading a 1 MB file should take 1000/80 = 12.5 ms, right? Is it actually possible to read the file that fast? Is there anything wrong with my code?
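One way to narrow this down is to time the fread itself, separately from opening and seeking. The sketch below reuses the question's fopen/fread approach but adds error checks and a timer around the actual read; the function name `read_file_timed` and the error-handling choices are my own additions, not part of the question's code.

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Read a whole file into a vector and report how long the fread took.
// An empty vector signals failure (missing file or short read).
std::vector<unsigned char> read_file_timed(const char *path, double *ms_out) {
    std::vector<unsigned char> buf;
    if (ms_out) *ms_out = 0.0;
    FILE *f = fopen(path, "rb");
    if (!f) return buf;
    fseek(f, 0, SEEK_END);
    long fsize = ftell(f);
    rewind(f);
    auto t0 = std::chrono::steady_clock::now();
    if (fsize > 0) {
        buf.resize((size_t)fsize);
        if (fread(buf.data(), 1, buf.size(), f) != buf.size())
            buf.clear();                    // short read: report failure
    }
    auto t1 = std::chrono::steady_clock::now();
    fclose(f);
    if (ms_out)
        *ms_out = std::chrono::duration<double, std::milli>(t1 - t0).count();
    return buf;
}
```

Timing only the fread helps separate open/seek latency from raw transfer time. Also note the OS page cache: the second run on the same file is often far faster than the first, because the data is served from RAM rather than the platters.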

For a hard drive here are some of the required actions:

  1. Get the drive spinning at full speed (e.g. 5400 revolutions per minute).
  2. Read the directory (the drive may have already cached this).
  3. Search the directory for the starting track and sector.
  4. Position the head to the correct track.
  5. Wait for the sector to come around.
  6. Read the bits from the sector.
  7. {optional} Reposition the head to the next block of data (sector & track)
  8. Read more data.
  9. The bits have to travel from the hard drive to the memory on the platform. There may be wait times, because the address bus and the data bus are shared with many devices.

For an SSD, you can skip the parts about waiting for the platters to spin up and waiting for sectors to come around. Sector locations are computed directly, and there is no read head to reposition.

All of the above takes time.

One of the roadblocks is having to reposition the head of the hard drive to read another sector. Contiguous sectors have less overhead than fragmented sectors.

Edit 1: Increasing performance
You can increase the read performance by purchasing a hard drive that has a faster spin rate and a larger cache.

You can use an application that defragments your hard drive. This reduces the need to reposition the head to a different location for the next block.

Probably the best performance optimization is to keep the data flowing. There is overhead with each transaction (aka read). For example, reading 1 byte has the same overhead as reading 1024 bytes, so 1 read of 1024 bytes is more efficient than 1024 reads of one byte.
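As an illustration of the block-transfer point, here is a minimal sketch that reads a file in large chunks, one fread call per 64 KB rather than one per byte. The function name and the chunk size are arbitrary choices, not a prescribed API.

```cpp
#include <cstdio>
#include <cstddef>
#include <vector>

// Count the bytes in a file by reading in 64 KB chunks.
// Each fread call has fixed per-call overhead, so a few large
// reads are far cheaper than many one-byte reads.
size_t count_bytes_chunked(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) return 0;
    std::vector<unsigned char> buf(64 * 1024);
    size_t total = 0, n;
    while ((n = fread(buf.data(), 1, buf.size(), f)) > 0)
        total += n;
    fclose(f);
    return total;
}
```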

In keeping with the keep-the-data-flowing principle, use multiple threads. Use one thread that reads into one or more buffers. If one buffer fills up before the other threads finish processing it, read into another one. Keep the data flowing.
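That buffering scheme could be sketched roughly like this. The `BlockQueue` type, the block size, and the "processing" (here just counting bytes) are all hypothetical stand-ins: one reader thread keeps fread busy while the consumer works on blocks that have already arrived.

```cpp
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// A tiny producer/consumer queue of filled buffers.
struct BlockQueue {
    std::queue<std::vector<unsigned char>> q;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;

    void push(std::vector<unsigned char> b) {
        std::lock_guard<std::mutex> lk(m);
        q.push(std::move(b));
        cv.notify_one();
    }
    // Returns false once the reader is finished and the queue is drained.
    bool pop(std::vector<unsigned char> &out) {
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [&] { return !q.empty() || done; });
        if (q.empty()) return false;
        out = std::move(q.front());
        q.pop();
        return true;
    }
    void finish() {
        std::lock_guard<std::mutex> lk(m);
        done = true;
        cv.notify_all();
    }
};

// Reader thread fills 64 KB blocks; the calling thread "processes"
// each block (here: just totals the sizes) while the next read runs.
size_t total_bytes_threaded(const char *path) {
    BlockQueue bq;
    std::thread reader([&] {
        FILE *f = fopen(path, "rb");
        if (f) {
            std::vector<unsigned char> buf(64 * 1024);
            size_t n;
            while ((n = fread(buf.data(), 1, buf.size(), f)) > 0) {
                buf.resize(n);
                bq.push(buf);               // hand the filled buffer over
                buf.assign(64 * 1024, 0);   // start a fresh buffer
            }
            fclose(f);
        }
        bq.finish();
    });
    size_t total = 0;
    std::vector<unsigned char> block;
    while (bq.pop(block))
        total += block.size();
    reader.join();
    return total;
}
```

The win only appears when processing each block takes comparable time to reading it; if processing is trivial, a single-threaded loop is simpler and just as fast.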

Don't search the file for data; search memory instead. Rather than reading until a newline, read a huge block of data into memory and search that block for the newline. Searching memory is always faster than searching a hard drive.
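As a sketch of searching memory rather than the file, this hypothetical helper scans an already-loaded buffer for newlines with memchr instead of issuing one I/O call per line:

```cpp
#include <cstring>
#include <cstddef>

// Count newline characters in a buffer that was already read into memory.
// memchr scans RAM, so no further disk I/O happens during the search.
size_t count_lines(const unsigned char *data, size_t len) {
    size_t lines = 0;
    const unsigned char *p = data;
    const unsigned char *end = data + len;
    while (p < end) {
        const unsigned char *nl =
            (const unsigned char *)memchr(p, '\n', (size_t)(end - p));
        if (!nl) break;
        ++lines;
        p = nl + 1;
    }
    return lines;
}
```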

Lastly, if your platform has hardware support, use it. Some platforms have a DMA (Direct Memory Access) chip. This chip can read data from a port and store it into memory without using the CPU. This allows your CPU to execute instructions while the data is dumped into memory by the DMA chip. Again, block transfers will be more optimal than single-byte transfers. You will have to check the data sheet for your platform to see if your I/O hardware has DMA and whether the processor can access it.
