
How to return large vector from class without copying data

I am writing a program in which a class has a data member that is a big std::vector (on the order of 100k - 1M items). Other classes need to be able to access this vector. At the moment I have a standard accessor function that returns the vector, but I believe this returns a copy of the vector. I think it would be more memory- and time-efficient to just return an iterator or a pointer to the first element. However, if I do this, how can one then use this pointer to run through the vector and know when to stop (i.e. where the vector ends)?

My code looks like this:

class MyClass
{
    private:
        std::vector<MyObj> objects_;
        //...

    public:
        // Returns by value, so every call copies the whole vector.
        std::vector<MyObj> getObjects() { return objects_; }
        //...
};

This question arises in another form for me when I want to run through (simulated) concatenated vectors. If I have a vector of MyClass, I'd like to be able to iterate over all of the contained objects_ vectors. I know from this answer that boost::join does what I have in mind, but I think I need to return copies for it to work. Can I return a pointer to the vector and still retain the ability to iterate over it and others in succession?

To avoid the performance penalties, return references.

// Non-const version
std::vector<MyObj>& getObjects() { return objects_;}

// const version
std::vector<MyObj> const& getObjects() const { return objects_; }
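
A minimal sketch of how the two overloads behave at the call site, assuming a placeholder MyObj with a single int field and a hypothetical add() helper (neither appears in the question). The sumAll function is also an illustration only: it shows that the "concatenated vectors" case from the question can be handled with two nested loops over references, without copying any vector.

```cpp
#include <cassert>
#include <vector>

// Placeholder element type; the real MyObj is not shown in the question.
struct MyObj {
    int value;
};

class MyClass {
public:
    // Hypothetical helper so the example has some data to work with.
    void add(int v) { objects_.push_back(MyObj{v}); }

    // Non-const overload: callers may modify the vector in place.
    std::vector<MyObj>& getObjects() { return objects_; }

    // Const overload: read-only access, no copy.
    const std::vector<MyObj>& getObjects() const { return objects_; }

private:
    std::vector<MyObj> objects_;
};

// Sums every value across several MyClass instances. Both loops bind
// references, so no vector and no element is copied.
long sumAll(const std::vector<MyClass>& all) {
    long total = 0;
    for (const MyClass& c : all) {
        for (const MyObj& obj : c.getObjects()) {
            total += obj.value;
        }
    }
    return total;
}
```

Note that the caller has to keep the reference: `const auto& objs = a.getObjects();` refers to the member itself, whereas `auto objs = a.getObjects();` still makes a copy even though the accessor returns a reference.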

However, before you make that change, consider the downsides of exposing references to a member variable. It makes your class less flexible: you can't easily change objects_ to a different type of container, should that make more sense later, without impacting all the users of the class.

Make your class act as a collection by delegating to the vector data member. Of course, you may need to revisit the code that consumes MyClass but, with getObjects() commented out, the compiler will tell you where, and most of the changes are likely to be along these lines:

MyClass heapsOfThem;
// ...
// just delete the `getObjects()` *and use MyClass::iterator*
// instead of std::vector::iterator.
// for(std::vector<MyObj>::iterator it=
//    heapsOfThem.getObjects().begin()...
// )
for(MyClass::iterator it=heapsOfThem.begin()...)

The delegation code goes along the lines of the below. Once you have fixed your calling code, you can change your mind about which type (vector, list, set) to use as the internal container for your objects without changes in the calling code.

class MyClass
{
    private:
        std::vector<MyObj> objects_;
        //...

    public:


        size_t size() const {
          return objects_.size();
        }
        MyObj& operator[](size_t i) {
          return objects_[i];
        }
        const MyObj& operator[](size_t i) const {
          return objects_[i];
        }

        using iterator = std::vector<MyObj>::iterator;
        iterator begin() {
          return objects_.begin();
        }
        iterator end() {
          return objects_.end();
        }
        // TODO const iterators following the same pattern

        // *if the above is not enough for you*
        // uncomment this and let it return a *reference*
        // std::vector<MyObj>& getObjects() { return objects_; }
        //...
};
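
As a sketch of how the TODO above might be filled in, the version below adds const overloads of begin() and end() so that a const MyClass can be used in a range-based for loop. The const_iterator alias follows the standard containers' naming, and MyObj's int field, add() and sumValues() are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder element type; the real MyObj is not shown in the question.
struct MyObj {
    int value;
};

class MyClass {
public:
    using iterator = std::vector<MyObj>::iterator;
    using const_iterator = std::vector<MyObj>::const_iterator;

    // Hypothetical helper so the example has some data to work with.
    void add(int v) { objects_.push_back(MyObj{v}); }

    std::size_t size() const { return objects_.size(); }

    // Mutable iteration.
    iterator begin() { return objects_.begin(); }
    iterator end() { return objects_.end(); }

    // Read-only iteration, selected automatically on a const MyClass.
    const_iterator begin() const { return objects_.begin(); }
    const_iterator end() const { return objects_.end(); }

private:
    std::vector<MyObj> objects_;
};

// Iterates through the const overloads; nothing is copied.
int sumValues(const MyClass& c) {
    int total = 0;
    for (const MyObj& obj : c) {
        total += obj.value;
    }
    return total;
}
```

With member begin() and end() in place, calling code can write `for (MyObj& obj : heapsOfThem)` and never mention the underlying container type.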

You can refactor the class to have public methods that return an element of the array and the size of the array, so all other classes can fetch values without copying the entire vector.

public:
    unsigned int getMyObjArraySize() const;
    MyObj getMyObjElementAt(unsigned int index) const;

With this approach, there is only one instance of the vector, and any collaboration is done via the two public methods, which expose the size and index-based access to the values.

This approach is geared to the use of for-loops rather than iterators.

MyClass myClass;
// ...
MyObj myObj;

for(unsigned int i = 0; i < myClass.getMyObjArraySize(); i++) {
    myObj = myClass.getMyObjElementAt(i);
    // do stuff
}
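
A possible variation on the above, sketched under a few assumptions: the element accessor returns a const reference instead of a MyObj by value (so individual elements are not copied either), the index type is std::size_t to match std::vector, and bounds are checked with at(). MyObj's int field, add() and sumByIndex() are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Placeholder element type; the real MyObj is not shown in the question.
struct MyObj {
    int value;
};

class MyClass {
public:
    // Hypothetical helper so the example has some data to work with.
    void add(int v) { objects_.push_back(MyObj{v}); }

    std::size_t getMyObjArraySize() const { return objects_.size(); }

    // Returns a reference to the stored element; at() throws
    // std::out_of_range if index is past the end.
    const MyObj& getMyObjElementAt(std::size_t index) const {
        return objects_.at(index);
    }

private:
    std::vector<MyObj> objects_;
};

// Index-based traversal using only the two public accessors.
int sumByIndex(const MyClass& myClass) {
    int total = 0;
    for (std::size_t i = 0; i < myClass.getMyObjArraySize(); ++i) {
        const MyObj& myObj = myClass.getMyObjElementAt(i);
        total += myObj.value;
    }
    return total;
}
```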

There's no problem returning a pointer to the vector.

std::vector<MyObj>* getObjects() { return &objects_; }

Then, when you want to iterate over it, just dereference:

std::vector<MyObj>* objectsPtr = getObjects();
for (auto& obj : *objectsPtr)  // obj is a reference to each element, not an iterator
{
   ...
}

However, make sure that you don't add or remove elements (for example with push_back or erase) while iterating over the vector, since that can invalidate the iterators the loop relies on. Modifying existing elements in place is fine.
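
To make the distinction concrete, here is a small sketch assuming a placeholder MyObj with an int field and hypothetical add(), doubleAll() and duplicateAll() helpers. The first function only changes element values, so a range-based for loop through the pointer is safe. The second appends to the vector, so it uses an index loop with the size captured up front rather than iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder element type; the real MyObj is not shown in the question.
struct MyObj {
    int value;
};

class MyClass {
public:
    // Hypothetical helper so the example has some data to work with.
    void add(int v) { objects_.push_back(MyObj{v}); }

    std::vector<MyObj>* getObjects() { return &objects_; }

private:
    std::vector<MyObj> objects_;
};

// Safe with a range-based for: element values change, the vector's
// size and storage do not.
void doubleAll(MyClass& c) {
    std::vector<MyObj>* objectsPtr = c.getObjects();
    for (MyObj& obj : *objectsPtr) {
        obj.value *= 2;
    }
}

// Appends a copy of every original element. push_back may reallocate
// and invalidate iterators and references, so this loops by index over
// the original size instead.
void duplicateAll(MyClass& c) {
    std::vector<MyObj>* objectsPtr = c.getObjects();
    const std::size_t originalSize = objectsPtr->size();
    for (std::size_t i = 0; i < originalSize; ++i) {
        const MyObj copy = (*objectsPtr)[i];  // copy before the vector grows
        objectsPtr->push_back(copy);
    }
}
```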
