
Reading txt files and parsing them fast using C++ and Boost memory-mapped files

Important edit: The problem is not what I originally stated. After profiling manually, I found that when I replace the line "file >> x >> y >> z;" with the line "file.getline(buffer, size);", reading takes only 0.4 seconds. So the question is entirely different: how do I quickly parse the floats that "file >> x >> y >> z;" extracts from each line?

(I don't know whether I should delete the question, since the original question is no longer relevant.)

=== OLD === After extensive research on the internet and Stack Overflow, I understood that the best way to read large files in C++ is to use memory-mapped files.

I have a 15 MB txt file in which each line contains 3 floats separated by spaces.

I had this code:

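#include <fstream>

std::ifstream file(path);
float x, y, z;
// Test the extraction itself; looping on !file.eof() runs one extra, failed iteration.
while (file >> x >> y >> z) {
    // use x, y, z
}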

This reads the file in 9.5 seconds.

To read the file faster, I came up with the following code with help from Stack Overflow users (from the question "Stream types in C++, how to read from IstringStream?"). If I understand it correctly, it uses memory-mapped files and should be faster:

#include <iostream>
#include <boost/iostreams/stream.hpp>
#include <boost/iostreams/device/mapped_file.hpp>
namespace io = boost::iostreams;

int main()
{
    io::stream<io::mapped_file_source> str("test.txt");
    // you can read from str like from any stream, str >> x >> y >> z
    for(float x,y,z; str >> x >> y >> z; )
        std::cout << "Reading from file: " << x << " " << y << " " << z << '\n';
}

Unfortunately, the speed remains the same: still 9.5 seconds.

Any suggestions? Thanks.

Streams are slow. Part of the reason is that the constraints they must satisfy are onerous; part is that implementations tend to be poorly optimized.
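One way to sidestep formatted stream extraction, as a rough sketch rather than a benchmarked fix: load the text into memory and convert the numbers with std::strtof. The names Point and parse_points are hypothetical, and whether this is actually faster depends on the standard library, so it should be measured on the real file.

```cpp
#include <cstdlib>   // std::strtof
#include <string>
#include <vector>

// Hypothetical record type for one line of the file (three floats).
struct Point { float x, y, z; };

// Parses whitespace-separated floats, three at a time, from an in-memory string.
// Stops at the end of the text or at the first token that is not a number;
// an incomplete trailing group is discarded.
std::vector<Point> parse_points(const std::string& text) {
    std::vector<Point> points;
    const char* p = text.c_str();   // c_str() guarantees a terminating '\0'
    char* end = nullptr;
    for (;;) {
        float v[3];
        int n = 0;
        for (; n < 3; ++n) {
            v[n] = std::strtof(p, &end);   // skips leading spaces and newlines
            if (end == p) break;           // nothing converted: stop
            p = end;
        }
        if (n < 3) break;
        points.push_back({v[0], v[1], v[2]});
    }
    return points;
}
```

A call such as parse_points(contents), where contents holds the whole file, returns one Point per line. Note that std::strtof needs a NUL-terminated buffer and honors the current C locale, so it should not be pointed directly at a memory-mapped region, which is not guaranteed to end with '\0'.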

Try using Boost.Spirit parsers. The syntax takes some getting used to and compilation can sometimes be very slow, but Spirit's runtime performance is very high.

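Notice: The technical posts on this site are licensed under CC BY-SA 4.0. If you republish them, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.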

 