
Read from a file, or read the file into a buffer and then use the buffer (in C++)?

Original 2011-10-10 01:13:43 5 4 c++/ inputstream

I am writing a parser in which I need to read characters from a file. I will be reading the file character by character, and may even stop reading in the middle if some conditions are not satisfied.

So is it advisable to create an ifstream for the file, seek to the position every time, and start reading from there? Or should I read the entire file into a stream or buffer and then use that instead?

4 answers

If you can, use a memory-mapped file.

Boost offers a cross-platform one: http://www.boost.org/doc/libs/1_35_0/libs/iostreams/doc/classes/mapped_file.html
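To make the memory-mapping idea concrete, here is a minimal sketch that calls POSIX `mmap` directly instead of using the Boost class linked above, so it is not portable to Windows. The `MappedFile` name and its interface are hypothetical, chosen only for illustration; they are not Boost's API.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical RAII wrapper (not Boost's mapped_file): maps a file read-only
// and exposes its bytes as a [begin, end) range of const char.
class MappedFile {
public:
    explicit MappedFile(const std::string& path) {
        int fd = ::open(path.c_str(), O_RDONLY);
        if (fd < 0) throw std::runtime_error("open failed: " + path);

        struct stat st;
        if (::fstat(fd, &st) != 0) {
            ::close(fd);
            throw std::runtime_error("fstat failed: " + path);
        }
        size_ = static_cast<std::size_t>(st.st_size);

        if (size_ > 0) {  // mmap rejects zero-length mappings
            void* p = ::mmap(nullptr, size_, PROT_READ, MAP_PRIVATE, fd, 0);
            if (p == MAP_FAILED) {
                ::close(fd);
                throw std::runtime_error("mmap failed: " + path);
            }
            data_ = static_cast<const char*>(p);
        }
        ::close(fd);  // the mapping stays valid after the descriptor is closed
    }

    ~MappedFile() {
        if (data_) ::munmap(const_cast<char*>(data_), size_);
    }

    MappedFile(const MappedFile&) = delete;
    MappedFile& operator=(const MappedFile&) = delete;

    const char* begin() const { return data_; }
    const char* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

private:
    const char* data_ = nullptr;
    std::size_t size_ = 0;
};
```

A parser can then walk `begin()` to `end()` with a plain pointer and stop wherever it likes; the operating system pages the file in on demand, so bytes past the stopping point may never be read from disk.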

How big is the file? Do you make more than one pass? Whether you read it into an in-memory buffer or not, reading the file will consume (file size / BUFSIZ) reads to go through the whole thing. Reading character by character doesn't matter, because the underlying read still consumes BUFSIZ bytes at a time (unless you take steps to change that behavior) -- it just hands them out character by character.

If you're reading it and processing it in one pass anyway, then reading it into memory first means you always need (file size / BUFSIZ) reads, whereas -- assuming the stopping point is uniformly distributed over the file -- reading and processing it inline will take on average (file size / BUFSIZ) * 0.5 reads, which on a big file could be a substantial gain.
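The arithmetic above can be written out as a small sketch. The function names are made up for illustration, and the buffer size of 8192 used in the note below is only an assumed value, since BUFSIZ is implementation-defined.

```cpp
#include <cstddef>

// Number of buffer-sized reads needed to consume the whole file (rounded up).
constexpr std::size_t full_reads(std::size_t file_size, std::size_t buf_size) {
    return (file_size + buf_size - 1) / buf_size;
}

// Expected number of reads when parsing stops at a point that is uniformly
// distributed over the file: on average, half of the full count.
inline double expected_reads_with_early_stop(std::size_t file_size,
                                             std::size_t buf_size) {
    return 0.5 * static_cast<double>(full_reads(file_size, buf_size));
}
```

For a 1,048,576-byte file and an assumed 8192-byte buffer, `full_reads` gives 128, while the early-stopping estimate is 64.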

An even more important question might be "what are you doing looking for so complicated a solution?" The amount of time it takes to figure out the cute solution probably dominates any gains you'll make from looking for something fancier than the standard "while not end of file, get character and process" solution.

Seeking to the position every time before reading is not a good option here, as it degrades performance. Try creating a buffer and reading from that instead; it is a better idea and more efficient.

Try to read all of the file's contents into the buffer in one go, and then serve subsequent input needs from the buffer rather than reading from the file every time.

On a full-service OS (e.g. Windows, Mac OS, Linux, BSD...) the operating system will have a caching mechanism that handles this for you to some extent (assuming your usage patterns meet some definition of "usual").

Unless you are experiencing unacceptable performance, you might want to merrily ignore the whole issue (i.e. just use the naive file access primitives).