Which is faster, writing raw data to a drive, or writing to a file?

I need to write data to a drive. I have two options:

  1. Write raw sectors: _write(handle, pBuffer, size);
  2. Write into a file: fwrite(pBuffer, size, count, pFile);

Which way is faster?

I expected the raw sector write function, _write, to be more efficient. However, my test showed the opposite: fwrite is faster, and _write takes longer.

I've pasted my snippet; maybe my code is wrong. Can you help me out? Either way is okay by me, but I think the raw write is better, because it seems the data on the drive is at least encrypted.

#define SSD_SECTOR_SIZE 512

/* Test 1: raw sector writes to the volume G: */
int g_pSddDevHandle = _open("\\\\.\\G:", _O_RDWR | _O_BINARY, _S_IREAD | _S_IWRITE);
ulMovePointer = 0;
TIMER_START();
while (ulMovePointer < 1024 * 1024 * 1024)
{
    _write(g_pSddDevHandle, szMemZero, SSD_SECTOR_SIZE);
    ulMovePointer += SSD_SECTOR_SIZE;
}
TIMER_END();
TIMER_PRINT();

/* Test 2: buffered stdio writes to a file on F: */
FILE * file = fopen("f:\\test.tmp", "a+");
ulMovePointer = 0;  /* reset the byte counter so this loop runs when both tests share one scope */
TIMER_START();
while (ulMovePointer < 1024 * 1024 * 1024)
{
    fwrite(szMemZero, SSD_SECTOR_SIZE, 1, file);
    ulMovePointer += SSD_SECTOR_SIZE;
}
TIMER_END();
TIMER_PRINT();

Probably because a direct write isn't buffered. When you call fwrite, you are doing buffered writes, which tend to be faster in most situations. Essentially, each FILE* stream has an internal buffer that is flushed when it becomes full, which means you end up making fewer system calls, as you only write to disk in larger chunks.

To put it another way, in your first loop you are actually writing SSD_SECTOR_SIZE bytes to disk during each iteration. In your second loop you are not. You are only writing SSD_SECTOR_SIZE bytes to a memory buffer, which, depending on the size of the buffer, will only be flushed every Nth iteration.
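To make the "every Nth iteration" point concrete, here is a minimal sketch that simulates a buffered stream and counts how often the buffer is handed on. The `sim_stream` type, its 4096-byte buffer, and the function names are hypothetical stand-ins chosen for illustration; they are not the real stdio implementation or the real BUFSIZ.

```c
#include <stddef.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Hypothetical stand-in for a FILE*: a fixed buffer plus a counter of
   how many times the buffer was handed to the "device". */
typedef struct {
    unsigned char buf[4096];   /* assumed buffer size, not the real BUFSIZ */
    size_t used;               /* bytes currently waiting in the buffer */
    unsigned long flushes;     /* number of simulated system calls */
} sim_stream;

/* Hand the pending bytes to the device (here we only count the call). */
static void sim_flush(sim_stream *s)
{
    if (s->used > 0) {
        s->flushes++;
        s->used = 0;
    }
}

/* Buffered write: copy into memory, flush only when the buffer is full. */
static void sim_write(sim_stream *s, const void *data, size_t len)
{
    const unsigned char *p = data;
    while (len > 0) {
        size_t room = sizeof s->buf - s->used;
        size_t n = len < room ? len : room;
        memcpy(s->buf + s->used, p, n);
        s->used += n;
        p += n;
        len -= n;
        if (s->used == sizeof s->buf)
            sim_flush(s);
    }
}

/* Write `sectors` sectors through the simulated stream and return how
   many times the device was actually touched. */
static unsigned long count_device_writes(unsigned long sectors)
{
    static const unsigned char zero[SECTOR_SIZE];
    sim_stream s = { {0}, 0, 0 };
    for (unsigned long i = 0; i < sectors; i++)
        sim_write(&s, zero, SECTOR_SIZE);
    sim_flush(&s);             /* flush the tail, as fclose() would */
    return s.flushes;
}
```

With these assumed sizes, 2048 sector-sized calls reach the device only 256 times, whereas an unbuffered loop would reach it 2048 times.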

In the _write() case, the value of SSD_SECTOR_SIZE matters. In the fwrite case, the size of each underlying write will actually be the stdio buffer size (BUFSIZ by default). To get a better comparison, make sure the underlying buffer sizes are the same.
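One way to control the stdio buffer size is the standard setvbuf() call. The following is a sketch only; `open_with_buffer` is a hypothetical helper name, and the caller-supplied buffer is an assumption made so that the size is actually honoured (with a NULL buffer the library is free to pick its own size).

```c
#include <stdio.h>

/* Open `path` for binary writing with an explicit stdio buffer.
   Pass buf == NULL and buf_size == 0 to turn buffering off, so that each
   fwrite() is passed straight to the OS, much like a direct _write().
   A non-NULL buf must stay valid until the stream is closed.
   Returns NULL on failure. */
static FILE *open_with_buffer(const char *path, char *buf, size_t buf_size)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream. */
    int mode = (buf == NULL || buf_size == 0) ? _IONBF : _IOFBF;
    if (setvbuf(f, buf, mode, buf_size) != 0) {
        fclose(f);
        return NULL;
    }
    return f;
}
```

Calling it with a 512-byte buffer makes each flush the same size as one _write() in the first loop; calling it with no buffer removes stdio buffering from the comparison altogether.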

However, this is probably only part of the difference.

In the fwrite case, you are measuring how fast you can get data into memory. You haven't flushed the stdio buffer to the operating system, and you haven't asked the operating system to flush its buffers to physical storage. To compare more accurately, you should call fflush() before stopping the timers.

If you actually care about getting data onto the disk rather than just getting the data into the operating system's buffers, you should ensure that you call fsync()/FlushFileBuffers() before stopping the timer.
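The two flushing steps can be combined into one helper that is called just before the timer stops. This is a sketch under assumptions: `flush_to_disk` is a hypothetical name, and on Windows it uses the CRT's _commit() on the stream's file descriptor in place of calling FlushFileBuffers() on a HANDLE directly.

```c
#include <stdio.h>

#ifdef _WIN32
#include <io.h>                      /* _commit */
#define FD_OF(f)    _fileno(f)
#define SYNC_FD(fd) _commit(fd)      /* assumption: CRT route instead of FlushFileBuffers() */
#else
#include <unistd.h>                  /* fsync */
#define FD_OF(f)    fileno(f)
#define SYNC_FD(fd) fsync(fd)
#endif

/* Push everything written to `f` as far down as the OS allows.
   Step 1 moves the stdio buffer into the operating system;
   step 2 asks the operating system to write its own buffers to storage.
   Returns 0 on success, -1 on failure. */
static int flush_to_disk(FILE *f)
{
    if (fflush(f) != 0)
        return -1;
    if (SYNC_FD(FD_OF(f)) != 0)
        return -1;
    return 0;
}
```

In the question's second test, the call would go between the write loop and TIMER_END(), so that the measured time includes the data actually leaving memory.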

Other obvious differences:

  • The drives are different. I don't know how different.

  • The semantics of a write to a device are different from the semantics of a write to a file system; the file system is allowed to delay writes to improve performance until explicitly told not to (e.g. with a standard handle, a call to FlushFileBuffers()); writes directly to a device aren't necessarily optimised in that way. On the other hand, the file system must do extra I/O to manage metadata (block allocation, directory entries, etc.).

I suspect that you're seeing a difference in policy about how fast things actually get onto the disk. Raw disk performance can be very fast, but you need big writes and preferably multiple concurrent outstanding operations. You can also avoid buffer copying by using the right options when you open the handle.
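The answer does not name the options it has in mind. As an assumption, the sketch below uses the Win32 flags FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH with CreateFileA, and adds two small portable helpers for sizing the writes. The function names are hypothetical, and opening a volume this way normally needs elevated privileges.

```c
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>

/* Assumption: "the right options" means unbuffered, write-through access.
   Returns INVALID_HANDLE_VALUE on failure. */
static HANDLE open_raw_unbuffered(const char *device_path)
{
    return CreateFileA(device_path,
                       GENERIC_READ | GENERIC_WRITE,
                       FILE_SHARE_READ | FILE_SHARE_WRITE,
                       NULL,
                       OPEN_EXISTING,
                       FILE_FLAG_NO_BUFFERING | FILE_FLAG_WRITE_THROUGH,
                       NULL);
}
#endif

/* Unbuffered I/O generally requires transfer sizes that are whole
   multiples of the sector size. Round n up to the next multiple. */
static size_t round_up_to_sector(size_t n, size_t sector_size)
{
    return (n + sector_size - 1) / sector_size * sector_size;
}

/* Number of write calls needed to move `total` bytes in `chunk`-byte writes. */
static size_t writes_needed(size_t total, size_t chunk)
{
    return (total + chunk - 1) / chunk;
}
```

For the 1 GiB written in the question, 512-byte writes mean 2,097,152 calls, while 1 MiB writes mean 1,024 calls, which is the kind of "big write" the answer is recommending.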


 