How does C++ std::vector work?
How does adding and removing elements "rescale" the data? How is the size of the vector calculated (I believe it is kept track of)? Any additional resources for learning about vectors would be appreciated.
In terms of sizing, there are two values of interest for a std::vector: size and capacity (accessed via .size() and .capacity()).
.size() is the number of elements contained in the vector, whereas .capacity() is the total number of elements the vector can hold before memory has to be re-allocated.
If you .push_back() an element, size increases by one, up until you hit the capacity. Once the capacity is reached, the vector re-allocates memory with a larger capacity. The growth factor is implementation-defined; doubling is a common choice (libstdc++ and libc++ double, MSVC grows by about 1.5x).
You can reserve capacity ahead of time using .reserve(). For example (the capacities shown assume a doubling implementation):
std::vector<int> A;
A.reserve(1); // A: size:0, capacity:1 {[],x}
A.push_back(0); // A: size:1, capacity:1 {[0]}
A.push_back(1); // A: size:2, capacity:2 {[0,1]}
A.push_back(2); // A: size:3, capacity:4 {[0,1,2],x}
A.push_back(3); // A: size:4, capacity:4 {[0,1,2,3]}
A.push_back(4); // A: size:5, capacity:8 {[0,1,2,3,4],x,x,x}
Reallocations of memory would occur at lines 4, 5, and 7.
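To see where reallocations happen on a particular implementation, one can watch .capacity() change while pushing elements. The following is a minimal sketch; the helper name count_reallocations is made up for illustration, and the count it returns depends on the library's growth factor.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: push n ints and count how often capacity() changes.
// Every change in capacity corresponds to one (re)allocation.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```

Calling count_reallocations(1000) returns a number far smaller than 1000, which is why push_back is cheap on average even though an individual reallocation is expensive.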
The vector usually has three pointers: one to the start of its storage, one just past the last element, and one to the end of the allocated storage. If the vector has never been used they are all 0, or NULL.
When an element is inserted, the vector allocates some storage and sets its pointers. It might allocate room for 1 element, or 4, or 50.
Then it inserts the element and increments the last-element pointer.
When you insert more elements than there is room for, the vector has to get more memory. If the memory location changes, it has to copy (or move) all the elements into the new space and free the old space.
A common choice for resizing is to double the allocation every time it needs more memory.
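The three-pointer layout can be sketched roughly as follows. This is a simplified illustration for int only, assuming a doubling policy; the type and member names are made up, and real libraries work with allocators and uninitialized memory instead of new[].

```cpp
#include <cstddef>

// Simplified illustration of the three-pointer layout for a vector of int.
// All names here are hypothetical; real libraries use their own.
struct IntVecLayout {
    int* first = nullptr;           // start of the allocated storage
    int* last = nullptr;            // one past the last stored element
    int* end_of_storage = nullptr;  // one past the end of the allocation

    IntVecLayout() = default;
    IntVecLayout(const IntVecLayout&) = delete;             // no copying in this sketch
    IntVecLayout& operator=(const IntVecLayout&) = delete;

    // Size and capacity are not stored; they fall out of pointer arithmetic.
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
    std::size_t capacity() const { return static_cast<std::size_t>(end_of_storage - first); }

    void push_back(int value) {
        if (last == end_of_storage) {  // full: get more memory
            std::size_t old_size = size();
            std::size_t new_cap = capacity() == 0 ? 1 : capacity() * 2;
            int* bigger = new int[new_cap];
            for (std::size_t i = 0; i < old_size; ++i) bigger[i] = first[i];
            delete[] first;            // free the old space
            first = bigger;
            last = bigger + old_size;
            end_of_storage = bigger + new_cap;
        }
        *last++ = value;  // insert, then advance the last-element pointer
    }

    ~IntVecLayout() { delete[] first; }
};
```

This also answers the question of how the size is tracked: in such a layout it is simply last - first.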
The implementation of std::vector changed slightly with C++0x (C++11) and later, with the introduction of move semantics (see "What are move semantics?" for an introduction).
When adding an element to a std::vector which is already full, the vector is resized: it allocates a new, larger memory area, moves the existing data to the new area, frees the old area, and then adds the new element.
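Whether the existing elements are moved or copied during such a resize depends on the element type: standard library implementations move them only if the move constructor is noexcept (or the type cannot be copied at all). A small sketch to observe this, using a made-up Tracked type and push_many helper:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how it gets constructed.
// The template flag controls whether its move constructor is noexcept.
template <bool NoexceptMove>
struct Tracked {
    static std::size_t copies;
    static std::size_t moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept(NoexceptMove) { ++moves; }
};
template <bool NoexceptMove>
std::size_t Tracked<NoexceptMove>::copies = 0;
template <bool NoexceptMove>
std::size_t Tracked<NoexceptMove>::moves = 0;

// Push n temporaries into a fresh vector, forcing several reallocations.
template <bool NoexceptMove>
void push_many(std::size_t n) {
    std::vector<Tracked<NoexceptMove>> v;
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(Tracked<NoexceptMove>{});
    }
}
```

After push_many<true>(100) the copy counter stays at zero, while push_many<false>(100) records copies made during reallocation. Marking move constructors noexcept is therefore what lets a vector take advantage of move semantics when it grows.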
std::vector is a collection class in the Standard Template Library. Putting objects into a vector, taking them out, or the vector resizing itself when an item is added to a full vector all place requirements on the element type: depending on the operation, it must support copy or move construction and, for some operations, assignment. (See "Type requirements for std::vector" as well as "std::vector works with classes that are not default constructible?" for details.)
One way to think of std::vector is as a C-style array of contiguous elements, of the type specified when the vector is defined, with some additional functionality to integrate it into the Standard Template Library offerings. What separates a vector from a standard array is that a vector will dynamically grow as items are added. (See "std::vector and c-style arrays" as well as "When would you use an array rather than a vector/string?" for some discussion about differences.)
Using std::vector allows the use of other Standard Template Library components such as algorithms, so std::vector comes with quite a few advantages over a C-style array, as you get to use functionality that already exists.
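As a small illustration of reusing existing functionality, the standard algorithms operate directly on a vector's iterators. The function names sorted_copy and sum below are made up for this sketch:

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sort a copy of the input and return it; std::sort works on vector iterators.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}

// Add up the elements with std::accumulate.
int sum(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}
```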
You can specify an initial size if the maximum is known ahead of time. (See "Set both elements and initial capacity of std::vector" as well as "Choice between vector::resize() and vector::reserve()".)
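The difference between reserving capacity and setting a size can be sketched like this (the helper names make_reserved and make_sized are made up for illustration):

```cpp
#include <cstddef>
#include <vector>

// reserve(n) only sets aside memory; the vector still has no elements.
std::vector<int> make_reserved(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    return v;
}

// The sized constructor (like resize(n)) creates n value-initialized elements.
std::vector<int> make_sized(std::size_t n) {
    return std::vector<int>(n);
}
```

With reserve you then add elements via push_back without triggering reallocations; with a sized vector the elements already exist and you assign to them by index.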
The basic physical representation of std::vector is a set of pointers into memory allocated from the heap. These pointers support the actual operations: accessing the elements stored in the vector, deleting elements from the vector, iterating over the vector, determining the number of elements and the capacity, etc.
Since the physical representation is contiguous memory, deleting items may result in moving the remaining items to close any holes created by the delete operation.
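For example, erasing from the middle shifts every later element one slot toward the front, so the cost grows with the number of elements after the erased one. A minimal sketch (erase_at is a made-up helper and does no bounds checking):

```cpp
#include <cstddef>
#include <vector>

// Remove the element at `index`; the elements after it shift one slot left.
// Assumes index < v.size().
void erase_at(std::vector<int>& v, std::size_t index) {
    v.erase(v.begin() + static_cast<std::ptrdiff_t>(index));
}
```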
With modern C++ move semantics, the overhead of std::vector has been reduced, and it is typically the default container for most applications, as recommended by Bjarne Stroustrup in his book The C++ Programming Language, 4th Edition, which discusses C++11.
I think the basic idea of the std::vector can be understood with an example:
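template<typename T>
class vector {
    T *storage;
    unsigned int length, cap;

    // Allocate a new array of `cap` elements, copy the old elements over,
    // then free the old array.
    void resizeStorage() {
        T *copy = new T[cap];
        for (unsigned int i = 0; i < length; ++i) {
            copy[i] = storage[i];
        }
        delete[] storage;
        storage = copy;
    }

public:
    // A capacity of 0 is bumped to 1 so that doubling in reserve() makes progress.
    vector(unsigned int initialCap = 1)
        : storage(new T[initialCap ? initialCap : 1]),
          length(0),
          cap(initialCap ? initialCap : 1) {
    }
    // Copying is disabled to keep the sketch short (it would need a deep copy).
    vector(const vector&) = delete;
    vector& operator=(const vector&) = delete;

    unsigned int size() const {
        return length;
    }
    unsigned int capacity() const {
        return cap;
    }
    T& operator[](unsigned int index) {
        return storage[index];
    }
    const T& operator[](unsigned int index) const {
        return storage[index];
    }
    void push_back(const T& element) {
        reserve(length + 1);          // grow first if the array is full
        storage[length++] = element;  // then store the element and bump the size
    }
    void reserve(unsigned int newCap) {
        if (cap >= newCap) {
            return;
        }
        while (cap < newCap) {
            cap *= 2;
        }
        resizeStorage();
    }
    virtual ~vector() {
        delete[] storage;
    }
};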
We need to reserve enough capacity on every push_back if the storage is too small. We can reserve manually too, and clean up memory when we are done. I've used a growth factor of 2 here, as this is a common way to resize dynamic arrays (you could use 3 if you wanted to).
Note that a virtual destructor is considered best practice for classes meant to be used as base classes, and is not strictly necessary here since no inheritance is involved. Making the destructor non-virtual would avoid the cost of dynamic dispatch. std::vector itself does not have a virtual destructor: it is not designed to be derived from, and destroying the elements it contains does not require one.
The capacity parameter in the constructor is a convenience of this sketch. std::vector itself has no constructor that takes a capacity: its vector(n) constructor, added in C++11, creates n value-initialized elements, and C++14 added an Allocator argument to that overload. I won't dive into the details (an Allocator is simply another template class that controls how the memory for the vector's elements is obtained and released). The stdlib provides these high-level abstractions to standardise common operations with good performance.
There are numerous helper functions for vector in the stdlib as well, such as swap, begin and end; these just operate on the array store in a safe manner. The core idea of a real vector implementation is similar to the above (closest to C++ standards prior to C++11), and the additional logic is inferable.
The array copying logic could be done using std::copy, and we could also have used smart pointers (such as shared_ptr, or unique_ptr<T[]> for sole ownership) to clean up implicitly. Instead, I chose to use low-level APIs to demonstrate the logical steps.
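For comparison, here is roughly what the resize step could look like with std::copy and a smart pointer. This is a sketch under assumptions: it uses std::unique_ptr<T[]> rather than shared_ptr, since the storage has a single owner, and the helper name grow is made up.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical helper: replace `storage` (holding `length` elements) with a
// larger array of `new_cap` elements. Assumes new_cap >= length.
template <typename T>
void grow(std::unique_ptr<T[]>& storage, std::size_t length, std::size_t new_cap) {
    std::unique_ptr<T[]> bigger(new T[new_cap]);
    std::copy(storage.get(), storage.get() + length, bigger.get());
    storage = std::move(bigger);  // the old array is freed automatically
}
```

With this approach no explicit delete[] is needed, neither in the resize step nor in a destructor.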
I wrote a vector in C++ a year or so ago. It is an array with a set size (e.g. 16 chars) which is expanded by that amount when needed. That is to say, if the default size is 16 chars and you need to store "Hi my name is Bobby", then it will double the size of the array to 32 chars and store the char array there.
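The growth step described here could be sketched as below, assuming the array keeps doubling until the text fits (grown_capacity is a made-up name, not taken from the original code):

```cpp
#include <cstddef>

// Hypothetical helper: double `capacity` until it can hold `needed` chars.
// Assumes capacity > 0.
std::size_t grown_capacity(std::size_t capacity, std::size_t needed) {
    while (capacity < needed) {
        capacity *= 2;
    }
    return capacity;
}
```

"Hi my name is Bobby" is 19 characters long, so grown_capacity(16, 19) yields 32, matching the description above.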