Reading txt files and parsing them quickly using C++ and Boost memory-mapped files

Question

Important edit: The problem is not what I originally stated. After profiling manually, I found that when I replace the line "file >> x >> y >> z;" with the line "file.getline(buffer, size);" it takes only 0.4 seconds. So the question is entirely different: how can I quickly parse the floats from each line, which is what file >> x >> y >> z; does?

(I don't know whether I should delete the question, since the original question is no longer relevant.)

=== OLD === After extensive research on the internet and Stack Overflow, I understood that the best way to read large files with C++ is to use memory-mapped files.

I have a 15 MB txt file in which each line has 3 floats separated by spaces.

I had this code:

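ifstream file(path);
float x, y, z;
// Loop on the extraction itself: testing !file.eof() runs one extra
// iteration after the final read has already failed.
while (file >> x >> y >> z)
    ;  // the values are parsed and discarded; only the read time matters here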

This reads the file in 9.5 seconds.

To read the file faster, with help from Stack Overflow users, I came up with the code below (taken from the question "Stream types in C++, how to read from IstringStream?"). If I understand it correctly, it uses memory-mapped files and should read the file faster:

#include <iostream>
#include <boost/iostreams/stream.hpp>
#include <boost/iostreams/device/mapped_file.hpp>
namespace io = boost::iostreams;

int main()
{
    io::stream<io::mapped_file_source> str("test.txt");
    // you can read from str like from any stream, str >> x >> y >> z
    for(float x,y,z; str >> x >> y >> z; )
        std::cout << "Reading from file: " << x << " " << y << " " << z << '\n';
}

Unfortunately the speed remains the same: still 9.5 seconds.

Any suggestions? Thanks.

Answer 1

Streams are slow. This is partly because the constraints that apply to them are onerous, and partly because implementations tend to be poorly optimized.
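To illustrate what bypassing the stream layer can look like, here is a minimal sketch that parses float triples straight out of a character buffer with std::strtof. It is not from the original answer: the names Point and parse_points are made up for this example, and it assumes the whole file has already been read into a NUL-terminated buffer (for example a std::string), since std::strtof needs a terminator. Note that std::strtof follows the current C locale.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical record type for one line of the file.
struct Point { float x, y, z; };

// Parse whitespace-separated float triples from a NUL-terminated buffer.
// Stops at the first token that is not a number; an incomplete trailing
// triple is dropped.
std::vector<Point> parse_points(const char* text) {
    std::vector<Point> points;
    const char* p = text;
    for (;;) {
        float v[3];
        bool complete = true;
        for (int i = 0; i < 3; ++i) {
            char* end = nullptr;
            v[i] = std::strtof(p, &end);  // skips leading whitespace, incl. newlines
            if (end == p) {               // nothing consumed: no more numbers
                complete = false;
                break;
            }
            p = end;
        }
        if (!complete) break;
        points.push_back(Point{v[0], v[1], v[2]});
    }
    return points;
}
```

Calling parse_points(contents.c_str()) on a std::string holding the file would return one Point per complete line. Whether this is faster than the stream version on a given platform is something to measure rather than assume.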

Try using Boost.Spirit parsers. The syntax takes some getting used to and compilation can sometimes be very slow, but Spirit's runtime performance is very high.
