
Very different I/O performance in C++ on Windows

I'm a new user and my English is not very good, so I hope this is clear. We're facing a performance problem with large files (1 GB or more), especially (it seems) when we try to grow them in size.

Anyway, to verify our impressions we tried the following (on Windows 7 64-bit, 4 cores, 8 GB RAM, 32-bit code compiled with VC2008):

a) Open a nonexistent file. Write it from the beginning up to 1 GB in 1 MB chunks.
Now you have a 1 GB file.
Now pick 10000 random positions within that file, seek to each one and write 50 bytes there; the content doesn't matter.
Close the file and look at the results.
Creating the file is quite fast (about 0.3 s), and writing 10000 times is fast as well (about 0.03 s).
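For readers who want to reproduce the "10000 random positions" step, here is a minimal sketch of one way to generate the offsets. It is not the code from the posted program, which uses `rand()<<16` and therefore only ever produces offsets that are multiples of 65536. The helper name `make_random_offsets` and its parameters are hypothetical, and the sketch assumes a C++11 compiler (newer than the VC2008 used in the question).

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper (not part of the original program): returns `count`
// offsets uniformly distributed over [0, fileSize - recordSize], so that a
// record of `recordSize` bytes written at any offset stays inside the file.
// Returns an empty vector if a record cannot fit in the file.
std::vector<std::uint64_t> make_random_offsets(std::uint64_t fileSize,
                                               std::uint64_t recordSize,
                                               std::size_t count,
                                               unsigned seed)
{
    std::vector<std::uint64_t> offsets;
    if (recordSize == 0 || recordSize > fileSize) return offsets;

    std::mt19937_64 engine(seed);   // fixed seed makes a run repeatable
    std::uniform_int_distribution<std::uint64_t> pick(0, fileSize - recordSize);

    offsets.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        offsets.push_back(pick(engine));
    return offsets;
}
```

Calling `make_random_offsets(1ull << 30, 50, 10000, 42u)` gives the offsets for one run over a 1 GB file. Reusing the same seed for cases 'a' and 'b' makes both cases write to the same positions, which removes one variable from the comparison.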

Very good, this is the beginning.
Now try something else...

b) Open a nonexistent file, seek to 1 GB - 1 byte and write just 1 byte.
Now you have another 1 GB file.
Follow the next steps exactly as in case 'a', close the file and look at the results.
Creating the file is as fast as you can imagine (about 0.00009 s), but the write time is unbelievable: about 90 s!
b.1) Open a nonexistent file, don't write any bytes.
Proceed as before: randomize, seek and write, then close the file and look at the result.
The write time is just as long: about 90 s!
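The difference between cases 'a' and 'b' comes down to how the file reaches its final size. The following is a minimal, portable sketch of the two strategies using `std::ofstream`. The function names are hypothetical and this is not the test program from the question, only an illustration of the two creation paths.

```cpp
#include <algorithm>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Case 'a': write every byte of the file, in chunks of `chunk` bytes.
bool create_by_writing(const std::string& path, std::size_t size, std::size_t chunk)
{
    std::ofstream f(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!f || chunk == 0) return false;
    std::vector<char> zeros(chunk, 0);
    while (size > 0 && f) {
        std::size_t n = std::min(size, chunk);
        f.write(&zeros[0], static_cast<std::streamsize>(n));
        size -= n;
    }
    return f.good();
}

// Case 'b': seek to the last byte and write only that single byte.
bool create_by_seeking(const std::string& path, std::size_t size)
{
    std::ofstream f(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!f || size == 0) return false;
    f.seekp(static_cast<std::streamoff>(size - 1));
    f.put('\0');
    return f.good();
}

// Logical file size as reported by the stream library; -1 on failure.
long long logical_size(const std::string& path)
{
    std::ifstream f(path.c_str(), std::ios::binary | std::ios::ate);
    return f ? static_cast<long long>(f.tellg()) : -1;
}
```

Both functions produce files with the same logical size, for example `create_by_writing("a.dat", 1 << 20, 1 << 16)` and `create_by_seeking("b.dat", 1 << 20)`. Only in the first one has the program itself written every byte, which is exactly what separates the fast case from the slow one in the measurements above.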

OK, this is quite surprising. But there's more!

c) Open the file you created in case 'a' again, without truncating it. Pick 10000 random positions again and proceed as before. It is as fast as before: about 0.03 s to write 10000 times.

This sounds OK. Try another step.

d) Now open the file you created in case 'b', without truncating it. Pick 10000 random positions again and proceed as before. It is slow again, but the time drops to about 45 s. Maybe it would drop further on another try.

I really wonder why. Any ideas?

The following is part of the code I used to test the cases described above (you'll have to change something to get a clean compilation; I just cut and pasted from some source code, sorry).
The sample can read and write in random, ordered or reverse-ordered mode, but writing only, in random order, is the clearest test.
We tried using std::fstream and also calling CreateFile(), WriteFile() and so on directly; the results are the same (although std::fstream is actually a little slower).

Parameters for case 'a' => -f_tempdir_\\casea.dat -n10000 -t -p -w
Parameters for case 'b' => -f_tempdir_\\caseb.dat -n10000 -t -v -w
Parameters for case 'b.1' => -f_tempdir_\\caseb.dat -n10000 -t -w
Parameters for case 'c' => -f_tempdir_\\casea.dat -n10000 -w
Parameters for case 'd' => -f_tempdir_\\caseb.dat -n10000 -w

Run the test (and others as well) and see...

  // iotest.cpp : Defines the entry point for the console application.
  //

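  #include "stdafx.h"   // must come first when precompiled headers are enabled

  #define NOMINMAX      // stop <windows.h> defining min/max macros that break std::min
  #include <windows.h>
  #include <algorithm>  // std::min
  #include <fstream>    // std::fstream
  #include <iostream>
  #include <set>
  #include <stdexcept>  // std::invalid_argument
  #include <string>
  #include <vector>
  // Note: xw::str::strtol and xw::str::itoa are not defined in this excerpt;
  // they appear to come from the poster's own helper code.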

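  // Despite its name, this returns the current time in seconds
  // (performance-counter ticks divided by ticks per second).
  double RealTime_Microsecs()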
  {
     LARGE_INTEGER fr = {0, 0};
     LARGE_INTEGER ti = {0, 0};
     double time = 0.0;

     QueryPerformanceCounter(&ti);
     QueryPerformanceFrequency(&fr);

     time = (double) ti.QuadPart / (double) fr.QuadPart;
     return time;
  }

  int main(int argc, char* argv[])
  {
     std::string sFileName ;
     size_t stSize, stTimes, stBytes ;
     int retval = 0 ;

     char *p = NULL ;
     char *pPattern = NULL ;
     char *pReadBuf = NULL ;

     try {
        // Default
        stSize = 1<<30 ; // 1Gb
        stTimes = 1000 ;
        stBytes = 50 ;

        bool bTruncate = false ;
        bool bPre = false ;
        bool bPreFast = false ;
        bool bOrdered = false ;
        bool bReverse = false ;
        bool bWriteOnly = false ;

        // Consume the command-line parameters
        for(int index=1; index < argc; ++index)
        {
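           // A bare "throw ;" with no active exception calls std::terminate,
           // so throw a real exception that catch(...) below can handle.
           if ( '-' != argv[index][0] ) throw std::invalid_argument("option must start with '-'") ;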
           switch(argv[index][1])
           {
           case 'f': sFileName = argv[index]+2 ;
              break ;
           case 's': stSize = xw::str::strtol(argv[index]+2) ;
              break ;
           case 'n': stTimes = xw::str::strtol(argv[index]+2) ;
              break ;
           case 'b':stBytes = xw::str::strtol(argv[index]+2) ;
              break ;
           case 't': bTruncate = true ;
              break ;
           case 'p' : bPre = true, bPreFast = false ;
              break ;
           case 'v' : bPreFast = true, bPre = false ;
              break ;
           case 'o' : bOrdered = true, bReverse = false ;
              break ;
           case 'r' : bReverse = true, bOrdered = false ;
              break ;
           case 'w' : bWriteOnly = true ;
              break ;
           default: throw std::invalid_argument("unknown option") ;
              break ;
           }
        }

        if ( sFileName.empty() )
        {
           std::cout << "Usage: -f<File Name> -s<File Size> -n<Number of Reads and Writes> -b<Bytes per Read and Write> -t -p -v -o -r -w" << std::endl ;
           std::cout << "-t truncates the file, -p pre load the file, -v pre load 'veloce', -o writes in order mode, -r write in reverse order mode, -w Write Only" << std::endl ;
           std::cout << "Default: 1Gb, 1000 times, 50 bytes" << std::endl ;
           throw std::invalid_argument("missing file name") ;
        }

        if ( !stSize || !stTimes || !stBytes )
        {
           std::cout << "Invalid Parameters" << std::endl ;
           return -1 ;
        }

        size_t stBestSize = 0x00100000 ;


        std::fstream fFile ;
        fFile.open(sFileName.c_str(), std::ios_base::binary|std::ios_base::out|std::ios_base::in|(bTruncate?std::ios_base::trunc:0)) ;

        p = new char[stBestSize] ;
        pPattern = new char[stBytes] ;
        pReadBuf = new char[stBytes] ;
        memset(p, 0, stBestSize) ;
        memset(pPattern, (int)(stBytes&0x000000ff), stBytes) ;

        double dTime = RealTime_Microsecs() ;

        size_t stCopySize, stSizeToCopy = stSize ;

        if ( bPre )
        {
           do {
              stCopySize = std::min(stSizeToCopy, stBestSize) ;
              fFile.write(p, stCopySize) ;
              stSizeToCopy -= stCopySize ;
           } while (stSizeToCopy) ;
           std::cout << "Creating time is: " << xw::str::itoa(RealTime_Microsecs()-dTime, 5, 'f') << std::endl ;
        }
        else if ( bPreFast )
        {
           fFile.seekp(stSize-1) ;
           fFile.write(p, 1) ;
           std::cout << "Creating Fast time is: " << xw::str::itoa(RealTime_Microsecs()-dTime, 5, 'f') << std::endl ;
        }

        size_t stPos ;

        ::srand((unsigned int)dTime) ;

        double dReadTime, dWriteTime ;

        stCopySize = stTimes ;

        std::vector<size_t> inVect ;
        std::vector<size_t> outVect ;
        std::set<size_t> outSet ;
        std::set<size_t> inSet ;

        // Prepare vector and set.
        // Note: rand()<<16 is always a multiple of 65536, so every position is 64 KB aligned.
        do {
           stPos = (size_t)(::rand()<<16) % stSize ;
           outVect.push_back(stPos) ;
           outSet.insert(stPos) ;

           stPos = (size_t)(::rand()<<16) % stSize ;
           inVect.push_back(stPos) ;
           inSet.insert(stPos) ;
        } while (--stCopySize) ;

        // Write & read using vectors
        if ( !bReverse && !bOrdered )
        {
        std::vector<size_t>::iterator outI, inI ;
        outI = outVect.begin() ;
        inI = inVect.begin() ;
        stCopySize = stTimes ;
        dReadTime = 0.0 ;
        dWriteTime = 0.0 ;
        do {
           dTime = RealTime_Microsecs() ;
           fFile.seekp(*outI) ;
           fFile.write(pPattern, stBytes) ;
           dWriteTime += RealTime_Microsecs() - dTime ;
           ++outI ;

           if ( !bWriteOnly )
           {
              dTime = RealTime_Microsecs() ;
              fFile.seekg(*inI) ;
              fFile.read(pReadBuf, stBytes) ;
              dReadTime += RealTime_Microsecs() - dTime ;
              ++inI ;
           }
        } while (--stCopySize) ;
        std::cout << "Write time is " << xw::str::itoa(dWriteTime, 5, 'f') << " (Ave: " << xw::str::itoa(dWriteTime/stTimes, 10, 'f') << ")" <<  std::endl ;
        if ( !bWriteOnly )
        {
           std::cout << "Read time is " << xw::str::itoa(dReadTime, 5, 'f') << " (Ave: " << xw::str::itoa(dReadTime/stTimes, 10, 'f') << ")" << std::endl ;
        }
        } // End

        // Write in order
        if ( bOrdered )
        {
           std::set<size_t>::iterator i = outSet.begin() ;

           dWriteTime = 0.0 ;
           stCopySize = 0 ;
           for(; i != outSet.end(); ++i)
           {
              stPos = *i ;
              dTime = RealTime_Microsecs() ;
              fFile.seekp(stPos) ;
              fFile.write(pPattern, stBytes) ;
              dWriteTime += RealTime_Microsecs() - dTime ;
              ++stCopySize ;
           }
           std::cout << "Ordered Write time is " << xw::str::itoa(dWriteTime, 5, 'f') << " in " << xw::str::itoa(stCopySize) << " (Ave: " << xw::str::itoa(dWriteTime/stCopySize, 10, 'f') << ")" <<  std::endl ;

           if ( !bWriteOnly )
           {
              i = inSet.begin() ;

              dReadTime = 0.0 ;
              stCopySize = 0 ;
              for(; i != inSet.end(); ++i)
              {
                 stPos = *i ;
                 dTime = RealTime_Microsecs() ;
                 fFile.seekg(stPos) ;
                 fFile.read(pReadBuf, stBytes) ;
                 dReadTime += RealTime_Microsecs() - dTime ;
                 ++stCopySize ;
              }
              std::cout << "Ordered Read time is " << xw::str::itoa(dReadTime, 5, 'f') << " in " << xw::str::itoa(stCopySize) << " (Ave: " << xw::str::itoa(dReadTime/stCopySize, 10, 'f') << ")" <<  std::endl ;
           }
        }// End

        // Write in reverse order
        if ( bReverse )
        {
           std::set<size_t>::reverse_iterator i = outSet.rbegin() ;

           dWriteTime = 0.0 ;
           stCopySize = 0 ;
           for(; i != outSet.rend(); ++i)
           {
              stPos = *i ;
              dTime = RealTime_Microsecs() ;
              fFile.seekp(stPos) ;
              fFile.write(pPattern, stBytes) ;
              dWriteTime += RealTime_Microsecs() - dTime ;
              ++stCopySize ;
           }
           std::cout << "Reverse ordered Write time is " << xw::str::itoa(dWriteTime, 5, 'f') << " in " << xw::str::itoa(stCopySize) << " (Ave: " << xw::str::itoa(dWriteTime/stCopySize, 10, 'f') << ")" <<  std::endl ;

           if ( !bWriteOnly )
           {
              i = inSet.rbegin() ;

              dReadTime = 0.0 ;
              stCopySize = 0 ;
              for(; i != inSet.rend(); ++i)
              {
                 stPos = *i ;
                 dTime = RealTime_Microsecs() ;
                 fFile.seekg(stPos) ;
                 fFile.read(pReadBuf, stBytes) ;
                 dReadTime += RealTime_Microsecs() - dTime ;
                 ++stCopySize ;
              }
              std::cout << "Reverse ordered Read time is " << xw::str::itoa(dReadTime, 5, 'f') << " in " << xw::str::itoa(stCopySize) << " (Ave: " << xw::str::itoa(dReadTime/stCopySize, 10, 'f') << ")" <<  std::endl ;
           }
        }// End

        dTime = RealTime_Microsecs() ;
        fFile.close() ;
        std::cout << "Flush/Close Time is " << xw::str::itoa(RealTime_Microsecs()-dTime, 5, 'f') << std::endl ;

        std::cout << "Program Terminated" << std::endl ;

     }
     catch(...)
     {
        std::cout << "Something wrong or wrong parameters" << std::endl ;
        retval = -1 ;
     }

     if ( p ) delete []p ;
     if ( pPattern ) delete []pPattern ;
     if ( pReadBuf ) delete []pReadBuf ;

     return retval ;
  }

If you aren't flushing the changes, and you've got enough free RAM to cache them, you are just measuring the speed of your RAM.
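To include the cost of actually reaching the disk in a measurement, the data has to be pushed past both the runtime's buffer and the OS cache. Here is a minimal sketch of that, assuming the test is switched from `std::fstream` to a C `FILE*`, since the standard stream classes give no portable access to the underlying file handle. The helper name `flush_to_disk` is hypothetical. `_commit` is the Microsoft C runtime call and `fsync` is its POSIX counterpart.

```cpp
#include <cstdio>

#ifdef _WIN32
  #include <io.h>       // _commit
#else
  #include <unistd.h>   // fsync
#endif

// Hypothetical helper: pushes everything buffered for `f` to the storage
// device. Returns true on success, false on a null stream or any error.
bool flush_to_disk(std::FILE* f)
{
    if (!f || std::fflush(f) != 0) return false;   // C runtime buffer -> OS cache
#ifdef _WIN32
    return _commit(_fileno(f)) == 0;               // OS cache -> disk
#else
    return fsync(fileno(f)) == 0;                  // OS cache -> disk
#endif
}
```

Stopping the timer only after `flush_to_disk` returns means the measured interval covers the physical write as well as the copy into the cache.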

Also, be aware that some filesystems support "holes": areas of a file which aren't allocated. If the filesystem has such support (I don't know whether any do on Windows, but they may), then the test which seeks to 1 GB and then writes 1 byte will create a file which is mostly a "hole", and hence requires few blocks to be written.

Finally, you should repeat every test many times, flushing the entire cache each time (this is possible without a reboot on Windows using a Sysinternals tool) and running on an otherwise idle machine with any on-access antivirus software disabled. If you see big discrepancies in the performance, something strange is happening.
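A small harness makes the "repeat every test many times" advice easier to follow. The sketch below is one possible shape for it, assuming a C++11 compiler. `TimingSummary` and `time_repeated` are hypothetical names, and the harness does not flush the cache itself, so the caller still has to do that between runs.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

struct TimingSummary {
    double minSeconds;
    double medianSeconds;   // upper median when the sample count is even
    double maxSeconds;
};

// Runs `work` `repetitions` times and summarises the wall-clock durations.
// Returns all zeros if `repetitions` is 0.
template <typename Work>
TimingSummary time_repeated(Work work, std::size_t repetitions)
{
    std::vector<double> samples;
    samples.reserve(repetitions);
    for (std::size_t i = 0; i < repetitions; ++i) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }

    TimingSummary s = {0.0, 0.0, 0.0};
    if (samples.empty()) return s;
    std::sort(samples.begin(), samples.end());
    s.minSeconds = samples.front();
    s.medianSeconds = samples[samples.size() / 2];
    s.maxSeconds = samples.back();
    return s;
}
```

A large gap between the minimum and the maximum is the kind of discrepancy worth investigating before trusting any single figure.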


Be absolutely sure that any writes done by the previous test are completely flushed to disk before beginning the next test; otherwise they will skew your data completely.

Lastly, don't take any notice of performance data on a VM. Do all performance testing on real hardware if you want consistent results (even if you are deploying into a VM).

It might well depend on the filesystem that you are using. I don't really know much about the NTFS filesystem, but many filesystems optimize away empty space in sparse files, where empty space is defined as blocks in the file to which nothing was written. Note that reads will return 0 for these positions, but nothing was written.

The first test creates and allocates all required blocks in the file. Then the random modifications just modify the contents of already allocated disk blocks. The rest of the tests create the file but never actually write to it, so the filesystem might record the size of the file but no allocated blocks. In this case each write has to seek, allocate a block from the disk to the file, initialize it all to 0s and then write the data. You can check the actual space allocated from the disk in all cases as shown here.

The third block of results, 45 s, might well be related to the random locations hitting a mix of allocated blocks (fast writes) and unallocated blocks (slow writes).

EDIT: It seems that NTFS does have this type of sparse file optimization.

It's all down to disk caching and file system structure. Some read/write operations only need to touch the cache, if the data is already there, and are flushed to disk later. Other operations must go to the disk because the data is not in the cache, and will be much slower.

The arrangement of blocks in the filesystem could also affect these things. In the extreme, adding that one byte to the 1 GB file may cause a good portion of it to be moved on the disk. Other factors such as fragmentation, block size and the way reads and writes are optimised and cached could cause the effects you see.
