How to correctly read from an already opened std::ifstream using a buffer
I am implementing a JSON parser and offer an operator>> function to parse from an std::ifstream. To speed up reading, I copy 16 KB into a buffer and let my parser read from the buffer. A small benchmark showed that this is faster than working directly with std::ifstream::get or std::ifstream::read.
When I successfully read a JSON value, I want to "put back" all unnecessary bytes from the buffer to the stream, so that a subsequent call of operator>> with the same std::istream continues parsing right where the first call ended. I currently implement this "putting back" like this:
is.clear();
is.seekg(start_position + static_cast<std::streamoff>(processed_chars));
is.clear();
Here, is is the input file stream, start_position is the initial value of is.tellg(), and processed_chars is the number of characters read by the parser.
This works with GCC and Clang on OSX and Linux, but MSVC 2015 and MSVC 2017 fail to bring the input stream into the desired state. Apparently, I am doing something wrong here. The different compilers should not behave so differently. The clear() calls are already the result of trial and error to make the code run with GCC/Clang.
What would be the correct way to (a) read from an already opened std::ifstream using a cache and (b) be able to resume parsing after the last processed character (instead of after the last cached character)?
Is there a better way to quickly read from an already opened std::ifstream? As I mentioned above, removing the cache makes the parser slower.
(Apologies for the naive question and the horrible implementation! I did not find an answer that dealt with an already open std::ifstream or that could "put back" already cached characters.)
If you open a file stream in text mode, this is not valid:
is.seekg(start_position + static_cast<std::streamoff>(processed_chars));
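...because according to the standard, the positions used by seekg / tellg are not directly related to the number of processed chars (this is actually OS-dependent). For example, a text-mode stream on Windows translates "\r\n" into "\n", so the number of characters the parser sees can differ from the offset in the file.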
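For contrast, here is a minimal sketch of the offset arithmetic on a stream opened in binary mode, where no newline translation takes place. It rests on an assumption that is not stated above: that on common implementations a binary-mode file position advances by one per byte read. The helper name reread_after and its parameters are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: reads up to `count` bytes from a file opened in binary
// mode, then seeks to `keep` bytes past the starting position by offset
// arithmetic and returns the rest of the current line.
std::string reread_after(const std::string& path, std::size_t count, std::size_t keep)
{
    std::ifstream is(path, std::ios::binary);  // binary: no "\r\n" -> "\n" translation
    const std::ifstream::pos_type start = is.tellg();

    std::string buffer(count, '\0');
    is.read(&buffer[0], static_cast<std::streamsize>(count));

    is.clear();  // a short read sets eofbit and failbit; seekg would do nothing otherwise
    is.seekg(start + static_cast<std::streamoff>(keep));

    std::string rest;
    std::getline(is, rest);  // everything after the `keep` bytes, up to the next '\n'
    return rest;
}
```

If the stream is handed to operator>> by the caller, its open mode is not under the parser's control, so this only helps when you open the file yourself.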
Here are possible options for you (I cannot give more details with what you gave in your question):
- putback, to put back the characters you read but did not use;
- tellg, to get the correct position.
Something like this maybe:
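// is is the istream
auto tg = is.tellg();          // remember where the buffered read starts
is.read(buffer, BUFFER_SIZE);
// process...
is.clear();                    // a short read sets eofbit and failbit
is.seekg(tg);                  // valid: tg was returned by tellg
is.ignore(processed_chars);    // skip only what the parser consumed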