How does std::vector support contiguous memory for custom objects of unknown size

I'm struggling with the correct mental model and understanding of std::vector.

What I thought I knew

When you create a vector of type T and then reserve N elements for the vector, the compiler basically finds and reserves a contiguous block of memory that is N * sizeof(T) bytes. For example,

// Initialize a vector of int
std::vector<int> intvec;

// Reserve contiguous block of 4 4-byte chunks of memory
intvec.reserve(4);  // [ | | | ]

// Filling in the memory chunks has obvious behavior:
intvec.push_back(1);  // [1| | | ]
intvec.push_back(2);  // [1|2| | ]

Then we can access any element in constant time (random access) because, if we ask for the kth element of the vector, we simply start at the memory address of the start of the vector and then "jump" k * sizeof(T) bytes to get to the kth element.
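As a rough sketch of that "jump", the snippet below computes an element's address by hand and compares it with what operator[] returns. The helper names address_by_jump and jump_matches_index are made up for illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: computes the address of element k by hand, i.e.
// start of the storage plus k * sizeof(int) bytes.
inline const int* address_by_jump(const std::vector<int>& v, std::size_t k) {
    assert(k < v.size());
    const unsigned char* base = reinterpret_cast<const unsigned char*>(v.data());
    return reinterpret_cast<const int*>(base + k * sizeof(int));
}

// Hypothetical helper: true when the hand-computed address is the same
// address that operator[] refers to.
inline bool jump_matches_index(const std::vector<int>& v, std::size_t k) {
    return address_by_jump(v, k) == &v[k];
}
```

For any in-range k, jump_matches_index should hold, because the elements of a std::vector are stored contiguously.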

Custom Objects

My mental model breaks down for custom objects of unknown/varying size. For example,

#include <vector>

class Foo {

public:
    Foo() = default;
    Foo(std::vector<int> vec): _vec{vec} {}

private:
    std::vector<int> _vec;
};

int main() {

    // Initialize a vector of Foo
    std::vector<Foo> foovec;

    // Reserve contiguous block of 4 ?-byte chunks of memory
    foovec.reserve(4);  // [ | | | ]

    // How does memory allocation work since object sizes are unknown?
    foovec.emplace_back(std::vector<int> {1,2});        // [{1,2}| | | ]
    foovec.emplace_back(std::vector<int> {1,2,3,4,5});  // [{1,2}|{1,2,3,4,5}| | ]

    return 0;
}

Since we don't know the size of each instance of Foo, how does foovec.reserve() allocate memory? Furthermore, how could you achieve random access if we don't know how far to "jump" to get to the kth element?

Your concept of size is flawed. A std::vector<type> object has a size that is known at compile time. It also manages storage whose size is determined at run time (this is allocated at run time and the vector holds a pointer to it). You can picture it laid out like

+--------+
|        |
| Vector |
|        |
|        |
+--------+
     |
     |
     v
+-------------------------------------------------+
|         |         |         |         |         |
| Element | Element | Element | Element | Element |
|         |         |         |         |         |
+-------------------------------------------------+

So when you have a vector of things that have a vector in them, each Element is an object containing a vector, and each of those points off to its own storage somewhere else, like

+--------+
|        |
| Vector |
|        |
|        |
+----+---+
     |
     |
     v
+----+----+---------+---------+
| Object  | Object  | Object  |
|  with   |  with   |  with   |
| Vector  | Vector  | Vector  |
+----+----+----+----+----+----+
     |         |         |   +---------+---------+---------+---------+---------+
     |         |         |   |         |         |         |         |         |
     |         |         +-->+ Element | Element | Element | Element | Element |
     |         |             |         |         |         |         |         |
     |         |             +-------------------------------------------------+
     |         |    +-------------------------------------------------+
     |         |    |         |         |         |         |         |
     |         +--->+ Element | Element | Element | Element | Element |
     |              |         |         |         |         |         |
     |              +-------------------------------------------------+
     |    +-------------------------------------------------+
     |    |         |         |         |         |         |
     +--->+ Element | Element | Element | Element | Element |
          |         |         |         |         |         |
          +---------+---------+---------+---------+---------+

This way all of the vector objects are next to each other, but the elements those vectors hold can be anywhere else in memory. It is for this reason you don't want to use a std::vector<std::vector<int>> for a matrix. Each sub-vector gets its memory wherever the allocator puts it, so there is no locality between the rows.
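One common alternative, sketched here under the assumption that all rows have the same length, is to keep the whole matrix in a single std::vector and compute the index yourself. FlatMatrix and its members are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a rows x cols matrix stored row-major in ONE
// std::vector, so each row sits directly after the previous one in memory.
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0) {}

    // Element (r, c) lives at offset r * cols + c in the single buffer.
    int& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    // Pointer to the first element of row r.
    const int* row_begin(std::size_t r) const {
        assert(r < rows_);
        return data_.data() + r * cols_;
    }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<int> data_;
};
```

With this layout, consecutive rows are exactly cols elements apart, which is the locality a vector of vectors cannot promise.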


Do note that this applies to all of the allocator-aware containers, as they do not store the elements inside the container object directly. This is not true for std::array as, like a raw array, the elements are part of the container. If you have a std::array<int, 20> then it is at least sizeof(int) * 20 bytes in size.
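To make the difference tangible, here is a small sketch that checks whether a container's first element lies inside the bytes of the container object itself. The helpers lives_inside and check_array_vs_vector are invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// std::array holds its elements inside the object, so it cannot be smaller
// than the elements themselves.
static_assert(sizeof(std::array<int, 20>) >= 20 * sizeof(int),
              "std::array stores its elements in place");

// Hypothetical helper: true when address p falls within the bytes of obj.
template <class T>
bool lives_inside(const T& obj, const void* p) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&obj);
    const auto addr = reinterpret_cast<std::uintptr_t>(p);
    return addr >= begin && addr < begin + sizeof(T);
}

// Hypothetical demo: array elements are part of the object, while vector
// elements sit in separately allocated storage.
inline void check_array_vs_vector() {
    std::array<int, 20> a{};
    std::vector<int> v(20);
    assert(lives_inside(a, a.data()));
    assert(!lives_inside(v, v.data()));
}
```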

The size of

class Foo {

public:
    Foo() = default;
    Foo(std::vector<int> vec): _vec{vec} {}

private:
    std::vector<int> _vec;
};

is known and constant. The internal std::vector does its allocation on the heap, so there is no problem doing foovec.reserve(4);

Otherwise, how could a std::vector live on the stack? ;-)
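A quick way to convince yourself, as a sketch only: measure the distance between two neighbouring elements of foovec. The helpers stride_in_bytes and make_foovec are hypothetical; Foo is the class from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Foo {
public:
    Foo() = default;
    Foo(std::vector<int> vec) : _vec{vec} {}

private:
    std::vector<int> _vec;
};

// Hypothetical helper: distance in bytes between element 0 and element 1.
inline std::size_t stride_in_bytes(const std::vector<Foo>& v) {
    assert(v.size() >= 2);
    const auto* first = reinterpret_cast<const unsigned char*>(&v[0]);
    const auto* second = reinterpret_cast<const unsigned char*>(&v[1]);
    return static_cast<std::size_t>(second - first);
}

// Hypothetical helper: builds the vector from the question, holding two Foo
// objects whose inner vectors have different lengths.
inline std::vector<Foo> make_foovec() {
    std::vector<Foo> foovec;
    foovec.reserve(4);
    foovec.emplace_back(std::vector<int>{1, 2});
    foovec.emplace_back(std::vector<int>{1, 2, 3, 4, 5});
    return foovec;
}
```

The stride equals sizeof(Foo) no matter how many ints each Foo owns, because those ints live in separate heap storage.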

The size of your class Foo is known at compile time. The std::vector class has a constant size, as the elements that it holds are allocated on the heap.

std::vector<int> empty{};
std::vector<int> full{};
full.resize(1000000);
assert(sizeof(empty) == sizeof(full));

Both instances of std::vector<int>, empty and full, will always have the same size despite holding a different number of elements.

If you want an array which you cannot resize, and whose size must be known at compile time, use std::array.

When you create a vector of type T and then reserve N elements for the vector, the compiler basically finds and reserves a contiguous block of memory

The compiler does no such thing. It generates code to request storage from the vector's allocator at runtime. By default this is std::allocator, which delegates to operator new, which will fetch uninitialized storage from the runtime system.
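To watch that runtime request happen, one option is to plug in a minimal custom allocator that records how many bytes it was asked for. This is only a sketch: RecordingAllocator, last_request_bytes and bytes_requested_by_reserve are names invented here, and the exact amount an implementation requests is not specified beyond being enough for the reserved capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical global record of the most recent allocation request.
inline std::size_t& last_request_bytes() {
    static std::size_t bytes = 0;
    return bytes;
}

// Hypothetical minimal allocator: notes the request size, then forwards to
// operator new, much like std::allocator does by default.
template <class T>
struct RecordingAllocator {
    using value_type = T;

    RecordingAllocator() = default;
    template <class U>
    RecordingAllocator(const RecordingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        last_request_bytes() = n * sizeof(T);
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const RecordingAllocator<T>&, const RecordingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const RecordingAllocator<T>&, const RecordingAllocator<U>&) { return false; }

// Hypothetical demo: returns the number of bytes requested when a
// vector<double> reserves room for n elements.
inline std::size_t bytes_requested_by_reserve(std::size_t n) {
    std::vector<double, RecordingAllocator<double>> v;
    last_request_bytes() = 0;
    v.reserve(n);
    assert(v.capacity() >= n);
    return last_request_bytes();
}
```

Calling bytes_requested_by_reserve(4) should report at least 4 * sizeof(double) bytes, requested while the program runs rather than set aside by the compiler.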

My mental model breaks down for custom objects of unknown/varying size

The only way a user-defined type can actually have unknown size is if it is incomplete, and you can't use a vector of an incomplete type: since C++17 you may name std::vector<T> while T is still incomplete, but T must be complete before any member of the vector is used.

At any point in your code where the type is complete, its size is also fixed, and you can declare a vector storing that type as usual.
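As an illustration of complete versus incomplete, and assuming a C++17 compiler, a tree node can hold a vector of its own type and still have a fixed size once the class definition ends. Node, node_size, count_nodes and check_node_size are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node;  // incomplete here: sizeof(Node) would not compile yet

struct Node {
    int value = 0;
    std::vector<Node> children;  // accepted since C++17 while Node is incomplete
};

// Node is complete from here on, so its size is a compile-time constant.
constexpr std::size_t node_size = sizeof(Node);

// Hypothetical helper: counts a node plus all of its descendants.
inline std::size_t count_nodes(const Node& n) {
    std::size_t total = 1;
    for (const Node& child : n.children) total += count_nodes(child);
    return total;
}

// Hypothetical helper: the object's size is the same however large the
// tree hanging off it has grown.
inline void check_node_size(const Node& n) {
    assert(sizeof(n) == node_size);
}
```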


Your Foo is complete, and its size is fixed at compile time. You can check this with sizeof(Foo), and sizeof(foovec[0]) etc.

The vector owns a variable amount of storage, but doesn't contain it in the object. It just stores a pointer and the reserved & used sizes (or something equivalent). For example, an instance of:

class toyvec {
  int *begin_;
  int *end_;
  size_t capacity_;
public:
  // push_back, begin, end, and all other methods
};

always has fixed size sizeof(toyvec) = 2 * sizeof(int*) + sizeof(size_t) + maybe_some_padding. Allocating a huge block of memory, and setting begin_ to the start of it, has no effect on the size of the pointer itself.
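Fleshing the toy out a little makes this easier to see. The following is a deliberately simplified sketch (int only, non-copyable, doubling growth chosen arbitrarily) and not how any real std::vector is implemented. ToyVec and its members are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical, simplified take on the toy vector: the object holds two
// pointers and a capacity, while the ints live in separately allocated storage.
class ToyVec {
    int* begin_ = nullptr;
    int* end_ = nullptr;
    std::size_t capacity_ = 0;

public:
    ToyVec() = default;
    ToyVec(const ToyVec&) = delete;             // copying omitted for brevity
    ToyVec& operator=(const ToyVec&) = delete;
    ~ToyVec() { delete[] begin_; }

    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
    std::size_t capacity() const { return capacity_; }

    void push_back(int value) {
        if (size() == capacity_) grow();
        *end_++ = value;
    }

    int operator[](std::size_t i) const {
        assert(i < size());
        return begin_[i];
    }

private:
    // Allocates a larger block, copies the old elements over, frees the old block.
    void grow() {
        const std::size_t new_cap = capacity_ == 0 ? 4 : capacity_ * 2;
        const std::size_t n = size();
        int* fresh = new int[new_cap];
        std::copy(begin_, end_, fresh);
        delete[] begin_;
        begin_ = fresh;
        end_ = fresh + n;
        capacity_ = new_cap;
    }
};
```

Pushing a thousand ints changes what begin_ points to and how big that block is, but sizeof(ToyVec) stays whatever the class definition made it.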


tl;dr C++ does not have dynamically-resizing objects. The size of an object is fixed permanently by the class definition. C++ does have objects which own - and may resize - dynamic storage, but that isn't part of the object itself.
