Programmer thought process: determining a maximum number of bytes to read when using ReadFile with the Windows API

I need to call the ReadFile function of the Windows API:

BOOL WINAPI ReadFile(
  _In_        HANDLE       hFile,
  _Out_       LPVOID       lpBuffer,
  _In_        DWORD        nNumberOfBytesToRead,
  _Out_opt_   LPDWORD      lpNumberOfBytesRead,
  _Inout_opt_ LPOVERLAPPED lpOverlapped
);

The argument I'm interested in is the third one:

nNumberOfBytesToRead [in]

The maximum number of bytes to be read.

I'm not so much interested in the "magic number" to put there as in the process a seasoned programmer follows to determine it, preferably in numbered steps.

Also keep in mind that I am writing my program in assembler, so I'm more interested in the thought process from that perspective.


This requires plenty of insight into both Windows and your hardware. But, in general, here are some possible directions:

  • Is the read buffered or unbuffered? If unbuffered, you may not even be able to choose the size freely; you have to follow strict rules for both the size and the alignment of the buffer.
  • In general, you'd want to let the operating system handle as much of the work as possible, because it knows a lot more about the storage device itself and its various users than you do in userspace. So you might want to fetch the whole thing at once, if possible (see points below).
  • If it turns out that that isn't good enough, you may try to outsmart it by experimenting with various sizes, to account for cases where your request could be served from buffers that already exist but that the OS, for some reason, doesn't always reuse across different requests.
  • Otherwise, you might experiment with sizes ranging anywhere between the disk sector size and multiples of the page size, as these are most likely to already be cached somewhere and to map directly onto actual hardware requests.
  • Other than performance, there's the question of how much you can afford to store in your process's memory at any given time.
  • There's also the question of whether sending large requests might block other processes from getting a chance to fetch some data in between, if the OS doesn't already take care of that somehow.
  • There's also the possibility that if you request chunks that are too large, the OS might defer your request until other processes' smaller ones have been served. On the flip side, if the requests target overlapping addresses, it might actually serve yours first in order to then serve the others from the cache.
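
To make the size rule for unbuffered reads concrete, here is a minimal sketch of rounding a requested byte count up to a whole number of sectors. The helper name round_up_to_sector is hypothetical, uint32_t stands in for DWORD, and the sector size is assumed to be a power of two that you have already obtained from the system; none of this comes from the original answer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round `wanted` up to the next multiple of
 * `sector_size`, so the result could be used as nNumberOfBytesToRead
 * for an unbuffered read.
 * Assumption: sector_size is a nonzero power of two (e.g. 512 or 4096). */
uint32_t round_up_to_sector(uint32_t wanted, uint32_t sector_size)
{
    /* Power-of-two check: exactly one bit set. */
    assert(sector_size != 0 && (sector_size & (sector_size - 1)) == 0);
    /* Guard against wrap-around in the addition below. */
    assert(wanted <= UINT32_MAX - (sector_size - 1));

    /* Add (sector_size - 1), then clear the low bits. */
    return (wanted + (sector_size - 1)) & ~(sector_size - 1);
}
```

From an assembler point of view this is just one add and one and with constant operands, so the alignment step costs essentially nothing compared with the read itself. The buffer you pass must be at least as large as the rounded-up value.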

In general, you'd probably want to experiment until you get something that works well enough.

That parameter is there only to protect you from buffer overflow, so you must of course never pass more than the size of the buffer you allocated for this purpose. Other than that, you should read only as many bytes as you need at that exact moment. A modern OS normally serves buffered reads through the page cache, so subsequent access to data that is already cached is about as fast as accessing RAM. You can also force the OS to cache the file beforehand if you need the whole thing.
Edit: My experience contradicts what Yam Marcovic and others recommend. Caching files and chunking reads into ideal sizes is exactly what the OS is there to do. Do not presume to outsmart it; read just what you need.
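
As a rough illustration of the "read what you need, capped by the buffer" rule from this answer, the sketch below computes the value to pass as nNumberOfBytesToRead for the next call. The function name next_request and its parameters are hypothetical, and uint32_t stands in for DWORD; it is one possible way to express the rule, not code from the answer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: how many bytes to ask for in the next read.
 *   total_needed    - bytes the program wants overall right now
 *   already_read    - bytes obtained by earlier calls
 *   buffer_capacity - size of the buffer handed to the read call
 * The result never exceeds buffer_capacity, so the read cannot
 * overflow the buffer. */
uint32_t next_request(uint64_t total_needed, uint64_t already_read,
                      uint32_t buffer_capacity)
{
    assert(already_read <= total_needed);

    uint64_t remaining = total_needed - already_read;
    return remaining < buffer_capacity ? (uint32_t)remaining
                                       : buffer_capacity;
}
```

A caller would loop, adding the count reported back through lpNumberOfBytesRead to already_read, and stop when next_request returns 0 or when a read reports fewer bytes than were requested, which for an ordinary file typically means the end of the file was reached.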
