
Performance of copying a file with fread/fwrite to USB

I'm looking at a piece of code that copies a file to a USB device. The following part is the important one:

while((bytesRead = fread(buf, 1, 16*1024, m_hSource)) && !bAbort) {
    // write to target
    long bytesWritten = fwrite(buf, 1, bytesRead, m_hTarget);

    m_lBytesCopied += bytesWritten;
    // (the check that all bytes were written follows here; not shown)
}

The customer says it is quite slow compared with normal PC-to-USB transfer speeds. I didn't write this code, so it's my job to optimize it.

So I was wondering whether it would be better to read the complete file first and then write it in one step, but I don't know how error-prone that would be. The code also checks after each copy step whether all bytes were written correctly, so that might slow the process down as well.

I'm not much of a C++ and hardware guru, so I'm asking you how I could speed things up while keeping the copy reliable.

  1. Try to read and write in big chunks. 16 MB or 32 MB is not bad for copying a file.
  2. If you just want to copy the file, you can always invoke system() with the platform's copy command. It will likely be faster than a hand-written loop.
  3. "The code also checks after each copy step whether all bytes were written correctly, so that might also slow down the process."

    You can do that check on bigger units by hashing: split the file into, say, 64 MB chunks and compare the hashes of the source and target chunks. The BitTorrent protocol verifies data per piece in a similar way.

  4. If you have mmap or MapViewOfFile available, map the file first and then write it to the USB device. That way the read operation is handled by the kernel.

  5. Kerrek just commented about using memcpy on mmap. memcpy between two mmapped files seems great.
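To make point 1 concrete, here is a minimal sketch of the same fread/fwrite loop with a much larger buffer and a short-write check. The function name copyFileChunked, the 16 MB default, and the -1 error convention are my own assumptions for illustration, not taken from the code in the question.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: copies srcPath to dstPath in chunks of chunkSize bytes.
// Returns the number of bytes copied, or -1 on any open/read/write error.
long long copyFileChunked(const char* srcPath, const char* dstPath,
                          std::size_t chunkSize = 16 * 1024 * 1024) {
    std::FILE* src = std::fopen(srcPath, "rb");
    if (!src) return -1;
    std::FILE* dst = std::fopen(dstPath, "wb");
    if (!dst) { std::fclose(src); return -1; }

    std::vector<char> buf(chunkSize);   // heap buffer, too big for the stack
    long long total = 0;
    bool ok = true;
    std::size_t bytesRead;
    while ((bytesRead = std::fread(buf.data(), 1, buf.size(), src)) > 0) {
        std::size_t bytesWritten = std::fwrite(buf.data(), 1, bytesRead, dst);
        total += static_cast<long long>(bytesWritten);
        if (bytesWritten != bytesRead) { ok = false; break; }  // short write
    }
    if (std::ferror(src)) ok = false;        // loop ended on a read error, not EOF
    std::fclose(src);
    if (std::fclose(dst) != 0) ok = false;   // errors from the final flush show up here
    return ok ? total : -1;
}
```

Comparing bytesWritten with bytesRead is cheap, so a per-chunk check like this is unlikely to be the bottleneck. Whether 16 MB is actually the best size for a given stick is something to measure.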

Also note that most recent operating systems write the data only to a cache at first and flush it to the USB stick later, for example when the stick is being removed. So a copy done by the OS may appear faster.

What about overlapping reads and writes?

In the current code, the total time is time(read original) + time(write copy). If you read the first block and then, while writing it, start reading the second block, and so on, your total time would be max(time(read original), time(write copy)) (plus the time for reading the first block and writing the last one, which can't be pipelined).

That could be almost half the time if reading and writing take more or less the same time.

You can do it with two threads or with asynchronous IO. Unfortunately, threads and async IO are platform dependent, so you'll have to check your system manual or choose appropriate portable libraries.
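As a rough sketch of the two-thread variant, assuming C++11 std::thread is available (which sidesteps part of the portability problem): a reader thread fills one buffer while the calling thread writes the other. The name copyFilePipelined, the two-slot layout, and the -1 error convention are illustrative assumptions only.

```cpp
#include <condition_variable>
#include <cstddef>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical double-buffered copy. Returns bytes written, or -1 on error.
long long copyFilePipelined(std::FILE* src, std::FILE* dst,
                            std::size_t chunkSize = 1024 * 1024) {
    struct Slot {
        std::vector<char> data;
        std::size_t used = 0;
        bool full = false;
    };
    Slot slots[2];
    for (Slot& s : slots) s.data.resize(chunkSize);

    std::mutex m;
    std::condition_variable cv;
    bool abortRead = false;  // set by the writer after a write error

    std::thread reader([&] {
        for (std::size_t i = 0;; i ^= 1) {  // alternate between slot 0 and 1
            Slot& s = slots[i];
            {
                std::unique_lock<std::mutex> lock(m);
                cv.wait(lock, [&] { return !s.full || abortRead; });
                if (abortRead) return;
            }
            // Read outside the lock so it overlaps with the writer's fwrite.
            std::size_t n = std::fread(s.data.data(), 1, chunkSize, src);
            {
                std::lock_guard<std::mutex> lock(m);
                s.used = n;
                s.full = true;
            }
            cv.notify_all();
            if (n == 0) return;  // an empty slot signals EOF or a read error
        }
    });

    long long total = 0;
    bool ok = true;
    for (std::size_t i = 0;; i ^= 1) {  // same slot order as the reader
        Slot& s = slots[i];
        {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [&] { return s.full; });
        }
        if (s.used == 0) break;
        std::size_t written = std::fwrite(s.data.data(), 1, s.used, dst);
        total += static_cast<long long>(written);
        std::lock_guard<std::mutex> lock(m);
        if (written != s.used) { ok = false; abortRead = true; break; }
        s.full = false;  // hand the slot back to the reader
        cv.notify_all();
    }
    cv.notify_all();  // wake the reader in case it is waiting after an abort
    reader.join();
    if (std::ferror(src)) ok = false;
    return ok ? total : -1;
}
```

The caller opens and closes both streams. The gain depends on source and target being different devices; if both sit on the same disk, the overlap buys little.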

I would just go with OS-specific functions, which will almost certainly do this faster than anything written only with C/C++ library functions.

On Linux this could be the sendfile function. On Windows, CopyFile will do the job.
