Fastest way of reading a file in Linux?

On Linux, what is the fastest way to read a file into an array of bytes, or to process its bytes? This can include memory mapping, system calls, etc. I am not familiar with many of the Linux-specific functions.

In the past I have used Boost memory mapping, but I need faster, Linux-specific performance rather than portability.

mmap should be the fastest way to access the contents of a file if the file is large enough. There's an initial cost for setting up the memory mappings, but that's offset by not needing to copy the data from the page cache into userland. And if you want all the contents of the file, the cost of allocating the memory to your program should be more or less the same as the cost of mmap.
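As a rough illustration of the mmap approach, here is a minimal sketch that maps a file read-only and walks over its bytes straight from the mapping. The helper name sum_bytes_mmap and the byte-summing "processing" step are assumptions made up for this example; substitute whatever work you actually need to do.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from the original answer): maps `path`
 * read-only and sums its bytes. Returns 0 on success, -1 on failure. */
int sum_bytes_mmap(const char *path, uint64_t *sum_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }

    uint64_t sum = 0;
    if (st.st_size > 0) { /* mmap rejects a zero-length mapping */
        size_t len = (size_t)st.st_size;
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) {
            close(fd);
            return -1;
        }

        /* The file contents are now addressable like an ordinary array;
         * pages are faulted in from the page cache on first touch. */
        for (size_t i = 0; i < len; i++)
            sum += p[i];

        munmap(p, len);
    }

    close(fd);
    *sum_out = sum;
    return 0;
}
```

The mapping stays valid after close(fd), so the descriptor could also be closed right after mmap succeeds.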

Your best bet, as always, is to test and benchmark.

Don't let yourself get fooled by lazy stuff like memory mapping. Rather, focus on what you really need. Do you really need to read the whole file into memory? Then the straightforward way of opening the file, reading chunks in a loop, and closing it will be as fast as it can be done.
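The open / read-in-a-loop / close approach might look roughly like the sketch below. The helper name read_whole_file and the 1 MiB chunk size are assumptions chosen for illustration, and the code assumes a regular file whose size fstat can report up front.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

enum { CHUNK = 1 << 20 }; /* assumed chunk size: 1 MiB per read() call */

/* Hypothetical helper (not from the original answer): reads all of `path`
 * into a malloc'd buffer. Returns the buffer (caller frees) and stores the
 * number of bytes read in *len_out; returns NULL on failure. */
unsigned char *read_whole_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return NULL;
    }

    size_t cap = (size_t)st.st_size;
    unsigned char *buf = malloc(cap ? cap : 1); /* avoid malloc(0) */
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    size_t got = 0;
    while (got < cap) {
        size_t want = cap - got;
        if (want > CHUNK)
            want = CHUNK;

        ssize_t n = read(fd, buf + got, want);
        if (n > 0) {
            got += (size_t)n;      /* short reads are fine, keep looping */
        } else if (n == 0) {
            break;                 /* EOF earlier than fstat suggested */
        } else if (errno != EINTR) {
            free(buf);
            close(fd);
            return NULL;
        }
    }

    close(fd);
    *len_out = got;
    return buf;
}
```

Whether a larger or smaller chunk size helps depends on the workload, so it is worth measuring rather than guessing.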

But often you don't really want that. Instead you might want to read specific parts: a block here, a block there, jumping through the file, reading a block at a specific position, etc.

Even then, seeking to those positions with fseek and reading the blocks with fread won't have overheads worth mentioning. It can be more convenient to use memory mapping and let the operating system or a library deal with things like memory allocation. It won't get the job done faster, though.
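For the "block at a specific position" case, a minimal sketch using fseek and fread could look like this. The helper name read_block_at is an assumption introduced for the example, not something from the original answer.

```c
#include <stdio.h>

/* Hypothetical helper: reads up to `len` bytes starting at byte `offset`
 * of an already opened stream. Returns the number of bytes actually read,
 * which may be short near end of file, or 0 if the seek fails. */
size_t read_block_at(FILE *f, long offset, void *buf, size_t len)
{
    if (fseek(f, offset, SEEK_SET) != 0)
        return 0;
    return fread(buf, 1, len, f);
}
```

A caller would open the file once with fopen(path, "rb"), call read_block_at for each block it needs, and fclose the stream at the end.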
