
How to return large vector from class without copying data

I am writing a program in which a class has a data member that is a big std::vector (on the order of 100k to 1M items). Other classes need to be able to access this vector. At the moment I have a standard accessor function that returns the vector, but I believe this returns a copy of it. I think it would be more memory- and time-efficient to return an iterator or a pointer to the first element instead. However, if I do this, how can the caller use that pointer to run through the vector and know when to stop (i.e. where the vector ends)?

My code looks like this:

class MyClass
{
    private:
        std::vector<MyObj> objects_;
        //...

    public:
        std::vector<MyObj> getObjects() { return objects_; }
        //...
};

This question arises in another form for me when I want to run through (simulated) concatenated vectors. If I have a vector of MyClass, I'd like to be able to iterate over all of the contained objects_ vectors. I know from this answer that boost::join does what I have in mind, but I think I need to return copies for it to work. Can I return a pointer to the vector and still retain the ability to iterate over it and the others in succession?

To avoid the performance penalties, return references.

// Non-const version
std::vector<MyObj>& getObjects() { return objects_;}

// const version
std::vector<MyObj> const& getObjects() const { return objects_; }

However, before you make that change, you have to consider the downsides of exposing references to a member variable. It makes your class less flexible. You can't easily change objects_ to a different type of container if that made more sense without impacting all the users of the class.
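As a rough sketch of how the const reference accessor could also cover the "concatenated vectors" part of the question, the nested loop below walks every objects_ vector in a vector of MyClass without copying any of them. The MyObj definition, the MyClass constructor, and the sumAll function are assumptions made up for this illustration; they are not from the question.

```cpp
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's MyObj.
struct MyObj {
    int value;
};

class MyClass {
public:
    // Hypothetical constructor, only here so the sketch can be populated.
    explicit MyClass(std::vector<MyObj> objects) : objects_(std::move(objects)) {}

    // Returns a reference to the member; no elements are copied.
    const std::vector<MyObj>& getObjects() const { return objects_; }

private:
    std::vector<MyObj> objects_;
};

// Visits every MyObj of every MyClass in order, as if the vectors
// were concatenated, and adds up the values.
long sumAll(const std::vector<MyClass>& all) {
    long total = 0;
    for (const MyClass& c : all) {
        // The range-for binds to the referenced member vector directly.
        for (const MyObj& obj : c.getObjects()) {
            total += obj.value;
        }
    }
    return total;
}
```

Note that the caller must also bind the result to a reference (as the range-for does here); writing `std::vector<MyObj> v = c.getObjects();` would still make a copy.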

Make your class act as a collection by delegating to the vector data member. Of course, you may need to revisit the code that consumes MyClass, but with getObjects() commented out the compiler will tell you where. Most of the changes are likely to be along the lines of

MyClass heapsOfThem;
// ...
// just delete the `getObjects()` *and use MyClass::iterator*
// instead of std::vector::iterator.
// for(std::vector<MyObj>::iterator it=
//    heapsOfThem.getObjects().begin()...
// )
for(MyClass::iterator it=heapsOfThem.begin()...)

The delegation code goes along the lines of the below. Once you have fixed your calling code, you can change your mind about which type (vector, list, set) to use as the internal container for your objects without changing the calling code.

class MyClass
{
    private:
        std::vector<MyObj> objects_;
        //...

    public:


        size_t size() const {
          return objects_.size();
        }
        MyObj& operator[](size_t i) {
          return objects_[i];
        }
        const MyObj& operator[](size_t i) const {
          return objects_[i];
        }

        using iterator = std::vector<MyObj>::iterator;
        iterator begin() {
          return objects_.begin();
        }
        iterator end() {
          return objects_.end();
        }
        // TODO const iterators following the same pattern

        // If the above is not enough for you, uncomment this
        // and let it return a *reference*.
        // std::vector<MyObj>& getObjects() { return objects_; }
        //...
};
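The const iterators left as a TODO above might look like the following minimal sketch. It assumes a trivial MyObj and adds a hypothetical add() helper and sumValues() function purely so the example can be exercised. With both begin()/end() pairs in place, MyClass works directly in a range-based for loop, for const and non-const objects alike.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's MyObj.
struct MyObj {
    int value;
};

class MyClass {
public:
    using iterator = std::vector<MyObj>::iterator;
    using const_iterator = std::vector<MyObj>::const_iterator;

    // Hypothetical helper so the sketch can be filled with data.
    void add(const MyObj& obj) { objects_.push_back(obj); }

    std::size_t size() const { return objects_.size(); }

    // Non-const access: callers may modify the elements.
    iterator begin() { return objects_.begin(); }
    iterator end() { return objects_.end(); }

    // Const access: selected when the MyClass object is const.
    const_iterator begin() const { return objects_.begin(); }
    const_iterator end() const { return objects_.end(); }

private:
    std::vector<MyObj> objects_;
};

// Iterates a const MyClass through the const overloads; nothing is copied.
int sumValues(const MyClass& c) {
    int total = 0;
    for (const MyObj& obj : c) {
        total += obj.value;
    }
    return total;
}
```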

You can refactor the class to have public methods that return an element of the array, and the size of the array, so all other classes can fetch values, without any copying of the entire vector.

public:
    unsigned int getMyObjArraySize();
    MyObj getMyObjElementAt(unsigned int index);

With this approach there is only one instance of the vector, and other classes work with it through the two public methods, which expose its size and index-based access to its elements.

This approach is geared to index-based for loops rather than iterators.

MyClass myClass;
// ...
MyObj myObj;

for(unsigned int i = 0; i < myClass.getMyObjArraySize(); i++) {
    myObj = myClass.getMyObjElementAt(i);
    // do stuff
}
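One caveat with the declarations above: getMyObjElementAt returns MyObj by value, so each call copies one element even though the vector as a whole is never copied. If MyObj is expensive to copy, a variant that returns a const reference may be preferable. The sketch below is one possible shape, assuming a trivial MyObj, a hypothetical constructor, and std::size_t in place of unsigned int; sumByIndex is likewise made up for the illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's MyObj.
struct MyObj {
    int value;
};

class MyClass {
public:
    // Hypothetical constructor, only here so the sketch can be populated.
    explicit MyClass(std::vector<MyObj> objects) : objects_(std::move(objects)) {}

    std::size_t getMyObjArraySize() const { return objects_.size(); }

    // Returns a reference to the stored element instead of a copy.
    // at() throws std::out_of_range for an invalid index.
    const MyObj& getMyObjElementAt(std::size_t index) const {
        return objects_.at(index);
    }

private:
    std::vector<MyObj> objects_;
};

// Index-based traversal that binds each element by const reference.
int sumByIndex(const MyClass& c) {
    int total = 0;
    for (std::size_t i = 0; i < c.getMyObjArraySize(); ++i) {
        const MyObj& obj = c.getMyObjElementAt(i);  // no per-element copy
        total += obj.value;
    }
    return total;
}
```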

There's no problem returning a pointer to the vector.

std::vector<MyObj>* getObjects() { return &objects_; }

Then, when you want to iterate over it, just dereference:

std::vector<MyObj>* objectsPtr = getObjects();
for (auto& obj : *objectsPtr)
{
   ...
}

However, make sure that nothing adds elements to or removes elements from the vector while you are iterating over it, since that can invalidate the iterators the loop is using.
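One way to stay clear of that problem, sketched here with a plain std::vector<int> rather than the question's MyObj, is to collect new elements in a separate vector during the loop and append them only after the loop has finished. The function name appendDoubled is made up for this illustration.

```cpp
#include <vector>

// Appends a doubled copy of every element. The vector is only read
// during the loop; the growth happens afterwards, so the loop's
// iterators stay valid for as long as they are in use.
void appendDoubled(std::vector<int>* values) {
    std::vector<int> pending;
    pending.reserve(values->size());
    for (int v : *values) {
        pending.push_back(v * 2);  // modifies pending, not *values
    }
    values->insert(values->end(), pending.begin(), pending.end());
}
```

Calling push_back on *values inside the loop instead could trigger a reallocation and leave the loop reading through dangling iterators.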

