
What is the most efficient way to read formatted data from a large file?

Original 2013-03-18 23:40:58 4 3 c++/ c/ file/ input/ io

Options:
1. Reading the whole file into one huge buffer and parsing it afterwards.
2. Mapping the file to virtual memory.
3. Reading the file in chunks and parsing them one by one.

The file can contain fairly arbitrary data, but it's mostly numbers, values, strings and so on, formatted in certain ways (commas, brackets, quotation marks, etc.). Which option would give me the greatest overall performance?

3 answers

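If the file is very large, you could consider using multiple threads with option 2 or 3. Each thread can process a single chunk of the file/memory, and this way you can overlap IO with computation (parsing).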
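As a rough sketch of that overlap, assuming C++11: one thread parses the current chunk while the calling thread reads the next one. The names `read_chunk`, `parse_chunk` and `count_commas_overlapped` are hypothetical, and `parse_chunk` is only a stand-in that counts commas. Counting single characters is not affected by where a chunk ends; a real parser would also have to handle tokens that straddle two chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <istream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for real parsing work: counts the commas in a chunk.
std::size_t parse_chunk(const std::string& chunk) {
    std::size_t commas = 0;
    for (char c : chunk) {
        if (c == ',') ++commas;
    }
    return commas;
}

// Reads up to chunk_size bytes; returns an empty string at end of input.
std::string read_chunk(std::istream& in, std::size_t chunk_size) {
    std::string chunk(chunk_size, '\0');
    in.read(&chunk[0], static_cast<std::streamsize>(chunk_size));
    chunk.resize(static_cast<std::size_t>(in.gcount()));
    return chunk;
}

std::size_t count_commas_overlapped(std::istream& in, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::size_t total = 0;
    std::string current = read_chunk(in, chunk_size);
    while (!current.empty()) {
        // Parse the current chunk on another thread...
        std::future<std::size_t> parsed = std::async(
            std::launch::async, [&current] { return parse_chunk(current); });
        // ...while this thread reads the next chunk.
        std::string next = read_chunk(in, chunk_size);
        // Wait for the parser before reusing the buffer it is reading from.
        total += parsed.get();
        current = std::move(next);
    }
    return total;
}
```

Passing a `std::istringstream` or a `std::ifstream` opened in binary mode both work here. Whether the overlap pays off depends on how expensive parsing is relative to IO, so it is worth measuring with chunk sizes much larger than a few bytes.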

It's hard to give a general answer to your question as choosing the "right" strategy heavily depends on the organization of the data you are reading.

Especially if there's a really huge amount of data to be processed, options 1 and 2 won't work anyway, as the available amount of main memory poses an upper limit to any attempt like this.

Most probably the biggest gain in terms of efficiency can be accomplished by (re)structuring the data you are going to process.

Before addressing the problem mentioned in the question, the first thing I'd try to improve is the organization of the data: check whether it can be arranged so that whole chunks don't have to be processed needlessly.

In terms of efficiency, choosing any of the mentioned methods wins you nothing more than a constant factor, whereas the right organization of your data might yield a much bigger improvement. The bigger the data, the more important that decision becomes.

Some facts about the data that seem interesting enough to take into consideration include:

Is there any regular pattern to the data you are going to process?
Is the data mostly static or highly dynamic?
Does it have to be parsed sequentially or is it possible to process data in parallel?

It makes no sense to read the entire file all at once and then convert from text to binary data; it's more convenient to write, but you run out of memory faster. I would read the text in chunks and convert as you go. The converted data, in binary format instead of text, will likely take up less space than the original source text anyway.
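A minimal sketch of reading in chunks and converting as you go, assuming the input is plain integers separated by commas or whitespace (the real format in the question is richer). `parse_ints_chunked` is a hypothetical name. The point it illustrates is that a number may be split across two chunks, so the partial token is carried over to the next read.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses integers separated by commas or whitespace,
// reading the stream in fixed-size chunks instead of all at once.
// std::stol throws if a token is not a valid integer.
std::vector<long> parse_ints_chunked(std::istream& in, std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<long> values;
    std::string buffer(chunk_size, '\0');
    std::string token;  // partial token carried across chunk boundaries

    // Converts the pending token, if any, to binary and stores it.
    auto flush = [&values, &token] {
        if (!token.empty()) {
            values.push_back(std::stol(token));
            token.clear();
        }
    };

    while (in) {
        in.read(&buffer[0], static_cast<std::streamsize>(buffer.size()));
        const std::streamsize got = in.gcount();  // may be short at end of file
        for (std::streamsize i = 0; i < got; ++i) {
            const char c = buffer[static_cast<std::size_t>(i)];
            if (c == ',' || std::isspace(static_cast<unsigned char>(c))) {
                flush();  // a separator ends the current token
            } else {
                token.push_back(c);
            }
        }
    }
    flush();  // the last token may not be followed by a separator
    return values;
}
```

Only one chunk of text is held in memory at a time. The vector of converted values is kept here purely for illustration; for very large inputs the values could be consumed or written out as they are produced. The function accepts any `std::istream`, for example a `std::ifstream` or a `std::istringstream`.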

What is the most efficient way to read from the end of a file in c++? (Parsing last 128 bits in a file)

What's the most efficient way to read a file into a std::string?

What is the most efficient way to read integer 2D array with unknown size from .txt file?

Most efficient way to read fragmented binary data from file and split it into several vectors. C++

What is the most efficient way to find multiple sums from a file?

An efficient way to read/write a large scene file

Qt: what is the most efficient way to vizualize a large 2D array?

Most efficient way to parse every fourth line from a very large file

Most memory efficient way to transpose a large file in C++

Most efficient way to initialize class member vector with large data set


The technical posts on this site are licensed under CC BY-SA 4.0. If you reprint them, please indicate the site URL or the original address. For any questions, please contact: yoyou2525@163.com.



solution1
2 2013-03-18 23:49:08

solution2
0 2013-03-19 00:00:12

solution3
0 2013-03-19 00:15:45
