C++ reading file is too slow

I'm trying to read a ~36KB file, and it takes ~20 seconds to finish this loop:

ifstream input_file;

input_file.open("text.txt");
if( !(input_file.is_open()) )
{
    cout<<"File not found";
    exit(1);
}

std::string line;
std::string word;
stringstream line_stream;   //to use >> operator to get words from lines

int lineNum=1;

while( getline(input_file,line) )   //Read file line by line until file ends
{
    line_stream.clear();    //clear stream
    line_stream << line;    //read line
    while(line_stream >> word)  //Read the line word by word until the line ends
    {
        //insert word into a linked list...
    }
    lineNum++;
}
input_file.close();

Any help would be appreciated.

stringstream::clear() does not clear the contents of the stream. It only resets the error and EOF flags, see http://en.cppreference.com/w/cpp/io/basic_ios/clear .

As a result, your line_stream accumulates all the previous lines, and the inner loop runs over all the accumulated lines again and again.

So the total time you spend is about O(n^2), compared to the O(n) you expect.
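To check that clear() leaves the buffer alone, here is a minimal sketch. The function names buffer_after_clear and buffer_after_reset are made up for this illustration; the only library calls used are clear(), str() and str("").

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes two lines with only clear() in between
// and returns what the stream's buffer holds afterwards.
std::string buffer_after_clear()
{
    std::stringstream ss;
    ss << "first line";
    ss.clear();            // resets the error/EOF flags only
    ss << "second line";
    return ss.str();       // the first line is still in the buffer
}

// Hypothetical helper: same as above, but empties the buffer with str("").
std::string buffer_after_reset()
{
    std::stringstream ss;
    ss << "first line";
    ss.str("");            // replaces the buffer contents with an empty string
    ss.clear();            // also reset the flags, as you would after a failed read
    ss << "second line";
    return ss.str();       // only the second line remains
}
```

With only clear(), the buffer keeps growing with every line. Calling str("") together with clear() is one way to reuse the same stream object if you prefer that over creating a new one.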

Instead of reusing the same object for every line, you could define line_stream inside the while loop, so that each line gets a brand new, empty instance. Like this:

ifstream input_file;   //ifstream, as in the question: the file is only read

input_file.open("text.txt");
if( !(input_file.is_open()) )
{
    cout<<"File not found";
    exit(1);
}

std::string line;
std::string word;

int lineNum=1;

while( getline(input_file,line) )   //Read file line by line until file ends
{
    stringstream line_stream;   //new, empty instance for each line
    line_stream << line;    //load the current line into the stream
    while(line_stream >> word)  //Read the line word by word until the line ends
    {
        //insert word into a linked list...
    }
    lineNum++;
}
input_file.close();
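The per-line part of the loop above can also be pulled out into a small function. This is only a sketch: split_words is a name I made up, and it returns a std::vector instead of filling your linked list, which I can't see.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the whitespace-separated words of one line.
std::vector<std::string> split_words(const std::string& line)
{
    // The stream is constructed from the line, so it never holds
    // anything left over from an earlier line.
    std::istringstream line_stream(line);
    std::vector<std::string> words;
    std::string word;
    while (line_stream >> word)   // operator>> skips leading whitespace
    {
        words.push_back(word);
    }
    return words;
}
```

Inside the getline loop you would then call split_words(line) and insert each returned word into your list.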

You could attempt the following:

std::ifstream file("text.txt");
std::string str;

while (std::getline(file, str))
{
    cout << str; //call a function here to retrieve the words of str in memory, not from the file
}

I ran your code in 11ms, but with the option mentioned above it took 8ms. Maybe it works for you.
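The "function to retrieve words of str in memory" is not shown above, so here is one possible sketch of it that avoids string streams entirely. The name words_in and the choice of whitespace characters are my assumptions, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: scans str in memory and returns its words.
std::vector<std::string> words_in(const std::string& str)
{
    const char* const whitespace = " \t\r\n\v\f";   // assumed separators
    std::vector<std::string> words;

    // Position of the first character that is not whitespace.
    std::size_t begin = str.find_first_not_of(whitespace);
    while (begin != std::string::npos)
    {
        // End of the current word (npos if the word runs to the end).
        const std::size_t end = str.find_first_of(whitespace, begin);
        // substr clamps the count, so end == npos takes the rest of the string.
        words.push_back(str.substr(begin, end - begin));
        // Start of the next word, or npos when nothing is left.
        begin = str.find_first_not_of(whitespace, end);
    }
    return words;
}
```

You would call words_in(str) in the loop where the cout line is. Whether this is actually faster than a stringstream depends on your compiler and library, so measure it.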

Try compiling with the build flag -O2 or -O3 .

I was surprised to see that a simple for-loop reading a 1GB file took 4.7 seconds, whereas another higher-level language (Dart) did it in 3.x seconds.

After enabling this flag, the runtime dropped to 2.1 seconds.
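If you want to compare builds yourself, a small timing helper is enough. This is a sketch using std::chrono; elapsed_ms is a made-up name, and the callable you pass in would be your own reading loop.

```cpp
#include <chrono>

// Hypothetical helper: runs work() once and returns the elapsed
// wall-clock time in milliseconds.
template <typename Work>
double elapsed_ms(Work&& work)
{
    // steady_clock is monotonic, so it is not affected by system clock changes.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Wrap the file-reading loop in a lambda, pass it to elapsed_ms, and run the program once without and once with the optimization flag. A single run can be noisy, so repeat it a few times before drawing conclusions.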
