
Most efficient way to read lines from text file to std::vector<string>

The common way to read the lines of a text file into a std::vector<std::string>, where each element of the vector corresponds to one line of the file, is something like this example:

https://stackoverflow.com/a/8365024/7030542

std::string line;
std::vector<std::string> myLines;
while (std::getline(myfile, line))
{
    myLines.push_back(line);
}

or alternatively:

https://stackoverflow.com/a/12506764/7030542

std::vector<std::string> lines;
for (std::string line; std::getline( ifs, line ); /**/ )
    lines.push_back( line );

Is there a more efficient way to do this, for example one that avoids the auxiliary string?

Don't overthink it:

std::vector<std::string> lines;
std::string line;
while(std::getline( ifs, line ))
    lines.push_back(std::move(line));

Note that the moved-from line is in a valid but unspecified state, so calling std::getline on it is fine: std::getline replaces the std::string's contents (whatever they may be) completely, erasing whatever state the move left behind.
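For reference, here is a self-contained sketch of that loop. The function names read_lines and read_lines_example are hypothetical, and an in-memory std::istringstream stands in for the file stream ifs so the behaviour can be checked without a file on disk.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the answer): reads every line of `in`.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        // Hand the string over to the vector instead of copying it.
        lines.push_back(std::move(line));
        // `line` is now valid but unspecified; the next std::getline
        // clears it before storing any characters.
    }
    return lines;
}

// Small check that uses an in-memory stream in place of a file.
void read_lines_example() {
    std::istringstream in("first\nsecond\n\nlast");
    const std::vector<std::string> lines = read_lines(in);
    assert(lines.size() == 4);
    assert(lines[0] == "first");
    assert(lines[2].empty());    // blank lines are kept
    assert(lines[3] == "last");  // a final line without '\n' is still read
}
```

The same function should work unchanged with a std::ifstream, since it only relies on the std::istream interface.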

@rubenvb's answer is great.

As an alternative

bool get_line_into_vector( std::istream& is, std::vector<std::string>& v ) {
  std::string tmp;
  if (!std::getline(is, tmp))
     return false;
  v.push_back(std::move(tmp));
  return true;
}

std::vector<std::string> lines;
while(get_line_into_vector( ifs, lines ))
{} // do nothing

This is rubenvb's solution with the temporary moved into a helper function.

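Moving a std::string that uses the small buffer optimization still copies the characters held in its inline buffer. We can avoid those copies by reading directly into the vector's last element: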

bool get_line_into_vector( std::istream& is, std::vector<std::string>& v ) {
  v.emplace_back();
  if (std::getline(is, v.back()))
    return true;
  v.pop_back();
  return false;
}

This can, in an edge case, cause one extra large reallocation: if the vector is exactly at capacity when the stream runs out, the final emplace_back grows it only for the new element to be popped again. That is asymptotically rare.

Unlike @pschill's answer, here the invalid states are confined to a helper function, and all of its flow control exists to keep those invalid states from leaking out.

The nice thing is that

std::vector<std::string> lines;
while(get_line_into_vector( ifs, lines ))
{} // do nothing

is how you use it; which of these two implementations you use is isolated to within the get_line_into_vector function. That lets you swap between them and determine which is better.
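As a rough illustration of that swap, the sketch below gives the two implementations distinct names and passes either one to the same calling loop. The names get_line_by_move, get_line_in_place, read_all and compare_variants are assumptions made for this example, and an in-memory stream replaces ifs. It only checks that both variants produce the same lines; deciding which is faster would still require measuring on real input.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Variant 1: read into a temporary, then move it into the vector.
bool get_line_by_move(std::istream& is, std::vector<std::string>& v) {
    std::string tmp;
    if (!std::getline(is, tmp))
        return false;
    v.push_back(std::move(tmp));
    return true;
}

// Variant 2: read directly into a freshly added last element.
bool get_line_in_place(std::istream& is, std::vector<std::string>& v) {
    v.emplace_back();
    if (std::getline(is, v.back()))
        return true;
    v.pop_back();  // the read failed, so drop the unused element
    return false;
}

// The calling loop is identical whichever variant is passed in.
template <typename Reader>
std::vector<std::string> read_all(const std::string& text, Reader reader) {
    std::istringstream in(text);
    std::vector<std::string> lines;
    while (reader(in, lines)) {
    }  // do nothing
    return lines;
}

// Both variants should yield the same lines for the same input.
void compare_variants() {
    const std::string text = "alpha\nbeta\n\ngamma\n";
    const std::vector<std::string> a = read_all(text, get_line_by_move);
    const std::vector<std::string> b = read_all(text, get_line_in_place);
    assert(a == b);
    assert(a.size() == 4);
}
```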

If you want to avoid temporary variables, you can use the last vector element as buffer:

std::vector<std::string> lines(1);
while (std::getline(ifs, lines.back()))
    lines.emplace_back();
lines.pop_back();  // remove the buffer element
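A self-contained sketch of this buffer-element approach follows, wrapped in a hypothetical function read_lines_using_back and tested against an in-memory stream instead of ifs. It removes the trailing buffer element with pop_back, which has the same effect as erasing the last element. Because the vector starts with one element, empty input still ends with an empty vector.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper: the last vector element doubles as the read buffer.
std::vector<std::string> read_lines_using_back(std::istream& in) {
    std::vector<std::string> lines(1);  // one empty element to read into
    while (std::getline(in, lines.back()))
        lines.emplace_back();           // next buffer for the next line
    lines.pop_back();                   // the last buffer never held a line
    return lines;
}

// Small check, including the empty-input case.
void read_lines_using_back_example() {
    std::istringstream in("one\ntwo");
    const std::vector<std::string> lines = read_lines_using_back(in);
    assert(lines.size() == 2);
    assert(lines[1] == "two");

    std::istringstream empty_input("");
    assert(read_lines_using_back(empty_input).empty());
}
```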
