简体   繁体   中英

cpp use istream_iterator to populate a vector, avoid copy

this question seems to be quite similar to others around here, but some crucial details are different in this case (they always copy the element of the iterator to the vector as far as I've seen posts).

Sticking with a quite common idiom (as far as I read), I'm doing file -> ifstream -> istream_iterator -> vector (this approach calls >> on type of the vector). The problem I've got with this is that according to the reference , istream_iterator

Dereferencing only returns a copy of the most recently read object

In my case I want to read some objects containing a vector-member, which means that the currently read object is copied to insert it into the vector (which means to copy construct the whole vector-member) and immediately afterwards it is constructed completely new when reading the new object.

So this copy is unneeded and just slowing down the whole thing. Is there a way to eliminate this uneeded copy (I imagine, the istream_iterator might just be move ing the object out and construct a new one afterwards, but maybe there is another way to avoid this copying)?

For illustration of the problem see some example code:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

struct My_class {
  std::vector<std::string> vec;

  // default/copy/move constructors to illustrate what constructors are being invoked
  My_class() : vec{} {
      std::cout << " default" << std::endl;
  }
  My_class(My_class const &other) : vec{other.vec} {
      std::cout << " copy" << std::endl;
  }
  My_class(My_class &&other) noexcept : vec{std::move(other.vec)} {
      std::cout << " move" << std::endl;
  }


  friend std::istream& operator>>(std::istream& is, my_class& i) {
    i.vec.clear(); // needed since i still contains the data from the previous object read
    std::string l;
    // in other use cases some more lines would be read or maybe the lines get preprocessed (e.g. split it up) and then inserted into the vector
    std::getline(is, l);
    i.vec.push_back(l);
    return is;
  }
};

int main(int argc, char* argv[]) {
  std::ifstream ifs{argv[1]};
  std::vector<my_class> data{std::istream_iterator<my_class>{ifs}, {}};

  for(const auto& _ele : data){
      for(const auto& ele : _ele.vec){
          std::cout << ele << " ";
      }
      std::cout << std::endl;
  }
}

Second try:

#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>
#include <memory>
#include <ranges>

static int cpy_cnt = 0;

struct My_class {
    std::vector<std::string> vec;

  // default constructor
  My_class() : vec{} {
    std::cout << vec.size() << " default" << std::endl;
  }

  My_class(My_class&&)      = default;
  My_class(const My_class&) = default;

  My_class& operator=(My_class&&) = default;
  My_class& operator=(My_class&)  = default;

  friend std::istream& operator>>(std::istream& is, My_class& i) {
    i.vec.clear();
    std::string l;
    std::getline(is, l);
    i.vec.push_back(l);
    return is;
  }
};

int main(int argc, char* argv[]) {
  std::ifstream ifs{argv[1]};
  std::cout << "iter create" << std::endl;
  auto tmp = std::views::istream<My_class>(ifs);
  std::cout << "vec create" << std::endl;
  std::vector<My_class> data2{};
}

gives an error containing error: no matching function for call to '__begin' (not gonna paste the complete long error message of clang).

I've been fiddling around with gcc and clang and it seams, that the error only occurs with clang while with gcc everything is just fine.

Configuring with cd build && cmake -DCMAKE_BUILD_TYPE=Debug -D CMAKE_C_COMPILER=clang -D CMAKE_CXX_COMPILER=clang++ -DCMAKE_EXPORT_COMPILE_COMMANDS=1.. and cd build && cmake -DCMAKE_BUILD_TYPE=Debug -D CMAKE_C_COMPILER=gcc -D CMAKE_CXX_COMPILER=g++ -DCMAKE_EXPORT_COMPILE_COMMANDS=1..

$ g++ --version
g++ (GCC) 12.2.0
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ clang++ --version
clang version 14.0.6
Target: x86_64-pc-linux-gnu
Thread model: posix

Like you have found out, that derefencing istream_iterator will give you a copy. One workaround for you is to use views::istream and ranges::move :

std::ifstream ifs{argv[1]};
auto ifs_view = std::views::istream<My_class>(ifs);

std::vector<My_class> data;
std::ranges::move(ifs_view, std::back_inserter(data));

Depends on your use case, you can also loop on ifs_view directly to avoid any constructions(since your operator >> insert data directly to the underlying vector):

for(const auto& _ele : ifs_view) { 
   // for loop logics here...
}

Sidenote, istream_view requires the object to be assignable, and your My_class 's assigning operators are implicitly deleted right now, so you would need to either provide at least one yourself, or just default the copy/move constructors.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM