std::vector<std::vector<T> > vs std::vector<T*>

Question

Given that memory overhead is critical to my application, I think that of the two options above, the latter would be more lightweight. Am I correct? I am basing this on the fact that the vector has a memory overhead of 4 pointers to keep track of begin(), end(), size() and the allocator. So the total size for the whole model would be in the order of

(4*sizeof(T*) + sizeof(T)*Ni)*No + 3*sizeof(std::vector<T>*)

Here, I am assuming Ni and No to be the number of elements in the inner and outer vectors, respectively. By using the latter option, I am hoping to save the 4*sizeof(T*)*No, since in my application No is huge, while Ni << No. Just to fix ideas, No is in the order of 100 million or more, and Ni is typically in the order of 3 to 50.
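To make the estimate above concrete, here is a minimal sketch that evaluates the formula for both layouts. The function names nestedBytes and pointerBytes are hypothetical, and the element size, pointer size and number of header pointers are passed in as assumptions rather than measured:

```cpp
#include <cassert>
#include <cstdint>

// Estimate for std::vector<std::vector<T> >: every inner vector carries a
// header of headerPtrs pointers next to its ni elements.
std::uint64_t nestedBytes(std::uint64_t no, std::uint64_t ni,
                          std::uint64_t elemSize, std::uint64_t ptrSize,
                          std::uint64_t headerPtrs) {
    assert(ptrSize > 0 && headerPtrs > 0);
    return (headerPtrs * ptrSize + elemSize * ni) * no  // inner vectors
           + headerPtrs * ptrSize;                      // outer vector header
}

// Estimate for std::vector<T*>: one pointer per inner array still remains.
std::uint64_t pointerBytes(std::uint64_t no, std::uint64_t ni,
                           std::uint64_t elemSize, std::uint64_t ptrSize,
                           std::uint64_t headerPtrs) {
    assert(ptrSize > 0 && headerPtrs > 0);
    return (ptrSize + elemSize * ni) * no  // one pointer plus the elements
           + headerPtrs * ptrSize;         // outer vector header
}
```

With No = 100 million, Ni = 3, 4-byte elements, 8-byte pointers and a 4-pointer header, this gives roughly 4.4 GB against 2.0 GB. Under these assumptions the saving is 3*sizeof(T*)*No rather than 4*sizeof(T*)*No, because the vector of pointers still stores one pointer per inner array. Neither figure includes the bookkeeping of the memory allocator itself.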

Thanks in advance for your interest and any ideas.

NOTE: I understand and am more than happy to pay the price of dealing with the pointer, including allocating, traversing, and deallocating it, and I can do so without any significant performance overhead.

Answer 1

It's actually 4, you missed the allocator. See "What is the overhead cost of an empty vector?"

Depends on your application. Do you never append to the internal vectors? Do they all have the same number of elements? Is the average size of the data stored in the internal vectors small?

If you answered yes to all the questions above, then maybe T* is an acceptable solution. If not, think about how you would handle that issue without the support of vector. It might be easier to just take the hit on memory.

Answer 2

As you see here, the exact overhead of a std::vector is implementation-dependent.
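Since the figure depends on the implementation, it can be checked directly on the target platform. A minimal sketch, with hypothetical helper names vectorHeaderBytes and extraBytesPerInnerArray:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size of the vector object itself, excluding the heap block it owns.
template <class T>
std::size_t vectorHeaderBytes() {
    return sizeof(std::vector<T>);
}

// Bytes saved per inner array when a std::vector<T> is replaced by a bare T*.
template <class T>
std::size_t extraBytesPerInnerArray() {
    assert(sizeof(std::vector<T>) >= sizeof(T*));
    return sizeof(std::vector<T>) - sizeof(T*);
}
```

Dividing vectorHeaderBytes<int>() by sizeof(void*) gives the header size in pointer-sized words. Common implementations report three, but the standard does not fix this value.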

Also note that if No is very large, some implementations will very probably store your data in chunks, in which case you also incur an overhead on the order of the number of chunks.

But in general I agree that the pointer implementation is cheaper space-wise.

Answer 3

I think that [the vector<T*>] would be better. Am I correct?

It would be smaller, but it wouldn't necessarily be "better". The change would saddle you with the necessity to allocate and free inner arrays. You would no longer have a way of knowing the size of the inner array.

Also note that some overhead on size would remain: as long as your inner arrays are allocated individually, there would be some additional storage reserved by the allocator in addition to the size of requested chunk, to let the deallocation routines know the size of the chunk.

If your memory requirements are so tight, consider allocating one vector for the whole array, and then parcel out the individual chunks into a vector of pointers. This would eliminate the per-chunk overhead of allocating the inner arrays individually.
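The suggestion above could be sketched as follows. This is only an illustration under assumptions: the class name PackedArrays and its members are hypothetical, the inner sizes are known up front, and they never change afterwards. Copying is disabled because the stored pointers refer into the object's own storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

template <class T>
class PackedArrays {
public:
    // sizes[i] is the number of elements in inner array i.
    explicit PackedArrays(const std::vector<std::size_t>& sizes) {
        std::size_t total = 0;
        for (std::size_t n : sizes) total += n;
        store_.resize(total);                // one allocation for all elements
        starts_.reserve(sizes.size() + 1);   // one allocation for all pointers
        T* next = store_.data();
        for (std::size_t n : sizes) {
            starts_.push_back(next);
            next += n;
        }
        starts_.push_back(next);             // end sentinel for the last array
    }
    PackedArrays(const PackedArrays&) = delete;
    PackedArrays& operator=(const PackedArrays&) = delete;

    std::size_t count() const { return starts_.size() - 1; }
    T* begin(std::size_t i) { assert(i < count()); return starts_[i]; }
    T* end(std::size_t i) { assert(i < count()); return starts_[i + 1]; }
    // The size is recovered from two neighbouring pointers, so it costs no storage.
    std::size_t size(std::size_t i) const {
        assert(i < count());
        return static_cast<std::size_t>(starts_[i + 1] - starts_[i]);
    }

private:
    std::vector<T> store_;    // all elements, back to back
    std::vector<T*> starts_;  // starts_[i] points at the first element of array i
};
```

Each inner array then costs one pointer plus its elements, and the size of an inner array is still available, which addresses the objection that a bare T* loses it.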

Answer 4

If you are concerned about the overhead of a vector, you should also be concerned about the overhead of malloc()/new: typical memory allocator overhead is at least two more pointers per memory region, which brings the overhead of a small vector<> up to five pointers (sizeof(vector<int>) == 3*sizeof(void*) on Linux).
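The allocator's bookkeeping is paid once per memory region, so the number of allocations matters as much as the header size. The sketch below counts allocations for both layouts with a small counting allocator. All names in it (g_allocations, CountingAllocator, allocationsNested, allocationsPacked) are hypothetical, and the counts assume a standard library that allocates exactly once per non-empty vector, which is an assumption about the implementation and may not hold in some debug configurations.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

static std::size_t g_allocations = 0;  // incremented on every allocate() call

template <class T>
struct CountingAllocator {
    typedef T value_type;
    CountingAllocator() {}
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) {}
    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Vector of vectors: one region for the outer vector, one per inner vector.
std::size_t allocationsNested(std::size_t no, std::size_t ni) {
    assert(ni > 0);
    typedef std::vector<int, CountingAllocator<int> > Inner;
    g_allocations = 0;
    std::vector<Inner, CountingAllocator<Inner> > outer;
    outer.reserve(no);
    for (std::size_t i = 0; i < no; ++i) outer.emplace_back(ni);
    return g_allocations;
}

// One big store plus one pointer table: two regions in total.
std::size_t allocationsPacked(std::size_t no, std::size_t ni) {
    assert(no > 0 && ni > 0);
    g_allocations = 0;
    std::vector<int, CountingAllocator<int> > store(no * ni);
    std::vector<int*, CountingAllocator<int*> > starts;
    starts.reserve(no + 1);
    for (std::size_t i = 0; i <= no; ++i) starts.push_back(store.data() + i * ni);
    return g_allocations;
}
```

Under these assumptions, the nested layout makes one allocation per inner array plus one for the outer vector, while the packed layout makes two regardless of how many inner arrays there are.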

So, what I would do is ask myself whether the size of the inner arrays needs to change once they have been initialized. If it is possible to avoid later reallocation of those arrays, I would allocate one huge chunk of memory, which I can then distribute to the different inner arrays, storing only their location:

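// innerArraySize holds innerArrayCount entries; one extra pointer marks the end.
int** pointerArray = new int*[innerArrayCount + 1];
int* store = new int[totalSizeOfInnerArrays];
int* nextArray = store;
for(int i = 0; i < innerArrayCount; i++) {
    pointerArray[i] = nextArray;
    nextArray += innerArraySize[i];
}
// End sentinel, set outside the loop so innerArraySize is never read out of bounds.
pointerArray[innerArrayCount] = nextArray;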

The size of an array can then be deduced from the difference of the next pointer and its own:

for(int i = 0; i < innerArrayCount; i++) {
    int* curArray = pointerArray[i];
    size_t curSize = pointerArray[i + 1] - pointerArray[i];
    //Do whatever you like with curArray.
}

Or, you can directly use that end pointer for iterating over the inner arrays:

for(int i = 0; i < innerArrayCount; i++) {
    for(int* iterator = pointerArray[i]; iterator < pointerArray[i + 1]; iterator++) {
        //Do whatever you like with *iterator.
    }
}

4 answers

solution1
2 ACCEPTED 2014-01-09 15:17:06

solution2
1 2014-01-09 15:20:39

solution3
1 2014-01-09 15:25:15

solution4
1 2014-01-09 15:52:34
