Why is ifstream::read much faster than using iterators?
There are many approaches to reading a file into a string. Two common ones are using ifstream::read to read directly into a string, and using istreambuf_iterators along with std::copy_n:
Using ifstream::read:
std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
in.read(&contents[0], contents.size());
Using std::copy_n:
std::ifstream in {"./filename.txt"};
std::string contents;
in.seekg(0, in.end);
contents.resize(in.tellg());
in.seekg(0, in.beg);
std::copy_n(std::istreambuf_iterator<char>(in),
            contents.size(),
            contents.begin());
Many benchmarks show that the first approach is much faster than the second one (on my machine, using g++-4.9, it is about 10 times faster with both the -O2 and -O3 flags), and I was wondering what the reason for this difference in performance might be.
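For anyone who wants to reproduce the comparison, here is a minimal timing sketch. The names read_with_read, read_with_copy_n, time_ms and slowdown_ratio are hypothetical helpers, not part of the original post. The sketch also opens the file in binary mode (an assumption, so that the size reported by tellg matches the number of bytes read on every platform) and returns an empty string if the file cannot be opened.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Size of the stream in bytes; leaves the read position at the start.
std::size_t stream_size(std::ifstream& in) {
    in.seekg(0, std::ios::end);
    const std::streamoff end = in.tellg();
    in.seekg(0, std::ios::beg);
    return end > 0 ? static_cast<std::size_t>(end) : 0;
}

// Approach 1: one bulk ifstream::read into the string's storage.
std::string read_with_read(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    std::string contents(stream_size(in), '\0');
    if (!contents.empty()) {
        in.read(&contents[0], static_cast<std::streamsize>(contents.size()));
        contents.resize(static_cast<std::size_t>(in.gcount()));
    }
    return contents;
}

// Approach 2: character-by-character copy through istreambuf_iterator.
std::string read_with_copy_n(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    std::string contents(stream_size(in), '\0');
    std::copy_n(std::istreambuf_iterator<char>(in),
                contents.size(),
                contents.begin());
    return contents;
}

// Wall-clock time of one call to f, in milliseconds.
template <typename F>
double time_ms(F&& f) {
    const auto start = std::chrono::steady_clock::now();
    f();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Runs each approach once; returns time(copy_n) / time(read), or 0 if the
// read was too fast to measure.
double slowdown_ratio(const std::string& path) {
    std::string a, b;
    const double t_read = time_ms([&] { a = read_with_read(path); });
    const double t_copy = time_ms([&] { b = read_with_copy_n(path); });
    assert(a == b);  // both approaches must produce the same bytes
    return t_read > 0.0 ? t_copy / t_read : 0.0;
}
```

A single run is noisy, and the second read benefits from the OS file cache, so a fair measurement would repeat each approach several times on a large file and compare the best or median times.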
read is a single iostream setup (part of every iostream operation) and a single call to the OS, reading directly into the buffer you provided.
The iterator works by repeatedly extracting a single char with operator>>. Because of the buffer size, this might mean more OS calls, but more importantly it also means repeated setting up and tearing down of the iostream sentry, which might mean a mutex lock, and usually means a bunch of other stuff. Furthermore, operator>> is a formatted operation, whereas read is unformatted, which is additional setup overhead on every operation.
Edit: Tired eyes saw istream_iterator instead of istreambuf_iterator. Of course istreambuf_iterator does not do formatted input. It calls sbumpc or something like that on the streambuf. That is still a lot of calls, and it goes through the stream's buffer, which is probably smaller than the entire file.
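To make the difference concrete, here is a rough sketch of the two access patterns at the streambuf level. The functions drain_per_char and drain_bulk are hypothetical names for illustration. The per-character loop only approximates what copying through istreambuf_iterator does (a real implementation typically also calls sgetc when dereferencing), and the bulk version uses sgetn, which is what an unformatted read is generally expected to reach after constructing its sentry once.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>   // only needed for the std::stringbuf usage noted below
#include <streambuf>
#include <string>

// Per-character pattern: one streambuf call per byte, roughly what an
// istreambuf_iterator-based copy does.
std::size_t drain_per_char(std::streambuf& sb, char* out, std::size_t n) {
    assert(out != nullptr || n == 0);
    using traits = std::char_traits<char>;
    std::size_t got = 0;
    while (got < n) {
        const traits::int_type c = sb.sbumpc();  // fetch one char and advance
        if (traits::eq_int_type(c, traits::eof())) break;
        out[got++] = traits::to_char_type(c);
    }
    return got;
}

// Bulk pattern: a single sgetn call asks the streambuf for all n bytes.
std::size_t drain_bulk(std::streambuf& sb, char* out, std::size_t n) {
    assert(out != nullptr || n == 0);
    const std::streamsize got = sb.sgetn(out, static_cast<std::streamsize>(n));
    return got > 0 ? static_cast<std::size_t>(got) : 0;
}
```

Both functions work on any std::streambuf, for example a std::stringbuf or the result of in.rdbuf() on an open std::ifstream. With the bulk call, a file buffer has the chance to copy large blocks at once (and some implementations may read straight into the destination), whereas the per-character loop pays a function call for every byte.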