
How does C++ std::vector work?

How does adding and removing elements "rescale" the data? How is the size of the vector calculated (I believe it is kept track of)? Any additional resources to learn about vectors would be appreciated.

In terms of sizing, there are two values of interest for a std::vector: size and capacity (accessed via .size() and .capacity()).

.size() is the number of elements contained in the vector, whereas .capacity() is the number of elements the vector can hold in its current allocation, i.e. before memory has to be re-allocated.

If you .push_back() an element, size will increase by one, up until you hit the capacity. Once the capacity is reached, the vector re-allocates memory with a larger capacity. The growth factor is implementation-defined: many implementations (libstdc++ and libc++, for example) double the capacity, while MSVC grows it by roughly 1.5x.

You can reserve a capacity using .reserve(). For example:

std::vector<int> A;
A.reserve(1);        // A: size:0, capacity:1  {[],x}
A.push_back(0);      // A: size:1, capacity:1  {[0]}
A.push_back(1);      // A: size:2, capacity:2  {[0,1]}
A.push_back(2);      // A: size:3, capacity:4  {[0,1,2],x}
A.push_back(3);      // A: size:4, capacity:4  {[0,1,2,3]}
A.push_back(4);      // A: size:5, capacity:8  {[0,1,2,3,4],x,x,x}

Assuming the capacity doubles, reallocations of memory would occur at lines 4, 5, and 7 of the snippet (the second, third, and fifth push_back calls).
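To see what a particular implementation actually does, the capacity can be recorded while elements are pushed. This is only a small sketch: capacity_history is a hypothetical helper name, and the values it reports depend on the standard library in use.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: push `count` ints into a vector and record every
// distinct capacity value the vector reports along the way.
std::vector<std::size_t> capacity_history(std::size_t count) {
    std::vector<int> v;
    std::vector<std::size_t> history;
    history.push_back(v.capacity());          // capacity of the empty vector
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != history.back()) {
            history.push_back(v.capacity());  // capacity changed: a reallocation happened
        }
    }
    return history;
}
```

Printing the returned values for, say, 100 elements shows the growth pattern of your library; the number of entries is far smaller than the number of push_back calls.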

The vector usually has three pointers. If the vector has never been used they are all 0, or NULL.

  • One to the first element of the vector. (this is the begin() iterator)
  • One to the last element of the vector + 1. (this is the end() iterator)
  • And one more to the last allocated but unused element + 1. (this minus begin() is the capacity)
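The three-pointer layout can be sketched as a small struct. The names VectorLayout, first, last and end_of_storage are made up for illustration; real implementations use their own names and add allocator handling.

```cpp
#include <cstddef>

// Hypothetical, simplified layout; not the code of any real standard library.
template <typename T>
struct VectorLayout {
    T* first = nullptr;           // first element (what begin() returns)
    T* last = nullptr;            // one past the last element (what end() returns)
    T* end_of_storage = nullptr;  // one past the end of the allocated block

    // Both values fall out of pointer arithmetic; nothing else needs to be stored.
    std::size_t size() const { return static_cast<std::size_t>(last - first); }
    std::size_t capacity() const { return static_cast<std::size_t>(end_of_storage - first); }
    bool empty() const { return first == last; }
};
```

With all three pointers null, size() and capacity() are both 0, which matches the "never been used" state described above.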

When the first element is inserted, the vector allocates some storage and sets its pointers. It might allocate 1 element, or it might allocate 4 elements. Or 50.

Then it inserts the element and increments the last element pointer.

When you insert more elements than are allocated, the vector has to get more memory. In practice it allocates a new, larger block, copies (or moves) all the elements into the new space, and frees the old space.

A common choice for resizing is to double the allocation every time it needs more memory.
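Why doubling is popular can be shown by counting how many element copies the reallocations cost. The two functions below are a rough model only (hypothetical names, starting capacities chosen for illustration), not a measurement of std::vector itself.

```cpp
#include <cstddef>

// Element copies performed by a grow-by-doubling array while `n` elements
// are appended one at a time, starting from capacity 1.
std::size_t copies_with_doubling(std::size_t n) {
    std::size_t capacity = 1, size = 0, copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size;  // every existing element goes to the new block
            capacity *= 2;
        }
        ++size;
    }
    return copies;
}

// Same count when the capacity grows by a fixed `step` instead.
std::size_t copies_with_fixed_step(std::size_t n, std::size_t step) {
    std::size_t capacity = step, size = 0, copies = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (size == capacity) {
            copies += size;
            capacity += step;
        }
        ++size;
    }
    return copies;
}
```

For 1024 appended elements the doubling model performs 1023 copies in total, while growing by 16 each time performs 32256. Geometric growth keeps the total proportional to the number of elements, which is what makes push_back amortized constant time.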

The implementation of std::vector changed slightly with C++0x (C++11) and later, with the introduction of move semantics (see What are move semantics? for an introduction).

When an element is added to a std::vector that is already full, the vector is resized: it allocates a new, larger memory area, moves the existing elements to the new storage, releases the old storage, and then adds the new element.
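The effect of move semantics on reallocation can be observed with an element type that counts its own copies and moves. Tracked and fill_past_capacity are hypothetical names for this sketch. It also assumes, as is the case in the major standard libraries, that reallocation moves elements when the move constructor is declared noexcept and falls back to copying otherwise.

```cpp
#include <vector>

// Hypothetical element type that counts how it gets transferred.
struct Tracked {
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }  // noexcept lets the vector move on reallocation
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Fill a vector well past its reserved capacity so reallocations must happen.
void fill_past_capacity() {
    std::vector<Tracked> v;
    v.reserve(2);
    for (int i = 0; i < 100; ++i) {
        v.emplace_back();  // constructed in place; only reallocation transfers elements
    }
}
```

After calling fill_past_capacity(), Tracked::copies stays at 0 and Tracked::moves is positive on the implementations I am aware of. Removing noexcept from the move constructor typically flips this, because the vector then copies to preserve its exception guarantee.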

std::vector is a collection class in the Standard Template Library. What the element type must support depends on the operations you use. Before C++11, elements had to be copy constructible and copy assignable. Since C++11 the requirements are stated per operation: adding an rvalue with push_back, or resizing a full vector, needs the type to be move insertable, while copying elements in needs it to be copy insertable. (See type requirements for std::vector as well as std::vector works with classes that are not default constructible? for details.)
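A short illustration of the per-operation requirements: std::unique_ptr has no copy constructor, yet it can be stored in a vector as long as only move-based operations are used. make_owned_ints is a hypothetical helper name.

```cpp
#include <memory>
#include <vector>

// Builds a vector of move-only elements; no copy of any element is ever made.
std::vector<std::unique_ptr<int>> make_owned_ints(int count) {
    std::vector<std::unique_ptr<int>> v;
    for (int i = 0; i < count; ++i) {
        v.push_back(std::unique_ptr<int>(new int(i)));  // the temporary is moved in
    }
    return v;  // the vector itself is moved (or elided), not copied
}
```

Trying to copy the resulting vector (for example `auto w = v;`) fails to compile, since that operation would need copyable elements.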

One way to think of std::vector is as a C style array of contiguous elements, of the type specified when the vector is defined, with some additional functionality that integrates it into the Standard Template Library. What separates a vector from a standard array is that a vector grows dynamically as items are added. (See std::vector and c-style arrays as well as When would you use an array rather than a vector/string? for some discussion about differences.)

Using std::vector allows the use of other Standard Template Library components such as algorithms, so std::vector has quite a few advantages over a C style array: you get to use functionality that already exists.
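As a small example of that existing functionality, the standard algorithms work directly on a vector's iterators. sort_and_sum and contains are hypothetical helper names.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Sorts the values in place and returns their sum.
int sort_and_sum(std::vector<int>& values) {
    std::sort(values.begin(), values.end());
    return std::accumulate(values.begin(), values.end(), 0);
}

// Linear search using std::find.
bool contains(const std::vector<int>& values, int wanted) {
    return std::find(values.begin(), values.end(), wanted) != values.end();
}
```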

You can specify an initial size, or reserve capacity up front, if the maximum is known ahead of time. (See Set both elements and initial capacity of std::vector as well as Choice between vector::resize() and vector::reserve())
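The difference between the two approaches is easy to mix up, so here is a brief sketch. Sizes, after_reserve and after_sized_construction are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <vector>

struct Sizes {
    std::size_t size;
    std::size_t capacity;
};

// reserve() only allocates storage; the vector still contains no elements.
Sizes after_reserve(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    return {v.size(), v.capacity()};
}

// The sized constructor creates n value-initialized elements (zeros for int).
Sizes after_sized_construction(std::size_t n) {
    std::vector<int> v(n);
    return {v.size(), v.capacity()};
}
```

After reserve(10) the size is 0 and the capacity is at least 10, so elements are still added with push_back. After constructing with 10 the size is already 10, and the elements are accessed with operator[].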

The physical representation of std::vector is basically a set of pointers into memory allocated from the heap. These pointers support the actual operations: accessing the elements stored in the vector, deleting elements from the vector, iterating over the vector, determining the number of elements, determining its capacity, etc.

Since the physical representation is contiguous memory, deleting items may result in moving of remaining items to close any holes created by the delete operation.
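The hole-closing behaviour is what the common erase-remove idiom relies on. In this sketch erase_all is a hypothetical helper name: std::remove shifts the kept elements toward the front, and erase then drops the leftover tail.

```cpp
#include <algorithm>
#include <vector>

// Removes every occurrence of `value`; the remaining elements are shifted
// down inside the same allocation, so no reallocation takes place.
void erase_all(std::vector<int>& v, int value) {
    v.erase(std::remove(v.begin(), v.end(), value), v.end());
}
```

Because the elements after each hole are moved, erasing from the front or middle of a large vector costs time proportional to the number of elements that follow.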

With modern C++ move semantics, the overhead of std::vector has been reduced to the point where it is typically the default container for most applications, as recommended by Bjarne Stroustrup in his book The C++ Programming Language, 4th Edition, which discusses C++11.

I think the basic idea of the std::vector can be understood with an example:

template<typename T>
class vector {

    T *storage;
    unsigned int length, cap;

    // Allocate a block of `cap` elements, copy the live elements over,
    // then release the old block.
    void resizeStorage() {
        T *copy = new T[cap];
        for (unsigned int i = 0; i < length; ++i) {
            copy[i] = storage[i];
        }
        delete [] storage;
        storage = copy;
    }

    public:
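    // A requested capacity of 0 is bumped to 1 so that doubling in reserve() makes progress.
    // Initializers are listed in member declaration order (storage, length, cap).
    explicit vector(unsigned int cap = 1)
        : storage(new T[cap ? cap : 1]), length(0), cap(cap ? cap : 1) {
    }

    // Copying is disabled: two copies would otherwise delete the same storage.
    vector(const vector&) = delete;
    vector& operator=(const vector&) = delete;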

    unsigned int size() const {
        return length;
    }

    unsigned int capacity() const {
        return cap;
    }

    T& operator[](int index) {
        return storage[index];
    }

    const T& operator[](int index) const { 
        return storage[index];
    }

    void push_back(T element) {
        reserve(length + 1);          // grow first if the new element would not fit
        storage[length++] = element;  // store at the old end, then bump the size
    }

    void reserve(unsigned int capacity) {
        if (cap >= capacity) {
            return;
        }
        while (cap < capacity) {
            cap *= 2;
        }
        resizeStorage();
    }

    virtual ~vector() { 
        delete[] storage;
    }

};

Every push_back first reserves more capacity if the storage is too small. We can reserve manually too, and the destructor cleans up the memory when we are done. I've used a growth factor of 2 here, as this is a common way to grow arrays in memory (you could use 3 if you wanted to).

Note that a virtual destructor is best practice for classes meant to be used as polymorphic base classes; it is not strictly necessary here, since no inheritance is involved. Making the destructor non-virtual avoids the virtual table pointer and allows static binding. std::vector itself does not have a virtual destructor: it is not designed to be derived from, and destroying its elements does not depend on one, because a vector<T> stores objects of exactly type T.

The capacity parameter in the constructor is a simplification of this sketch. std::vector has no constructor that takes a capacity: its count constructor, vector(n), creates n elements, and C++14 added an overload of it that also takes an Allocator. I won't dive into the details (Allocator is simply another template parameter that controls how the vector obtains and releases the memory for its elements). The stdlib provides these high-level abstractions to standardise common operations with good performance.

There are numerous helper functions for vector in the stdlib as well, such as swap, begin and end; these just operate on the array storage in a safe manner. Real implementations follow the same basic idea as the above, although they allocate raw memory through the allocator and construct elements in place rather than using new T[]. The additional logic can be inferred.

The array copying logic could be done with std::copy, and we could also have used a smart pointer (shared_ptr, or unique_ptr<T[]> for sole ownership) to clean up implicitly. Instead, I chose to use low-level APIs to demonstrate the logical steps.
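For comparison, the std::copy and smart-pointer variant mentioned above might look roughly like this. It is a teaching sketch under stated assumptions, not how any standard library implements std::vector: SimpleVector is a hypothetical name, it uses std::unique_ptr<T[]> rather than shared_ptr, and it requires T to be default constructible and copy assignable.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical teaching class; the unique_ptr releases the storage automatically.
template <typename T>
class SimpleVector {
    std::unique_ptr<T[]> storage;
    std::size_t length;
    std::size_t cap;

public:
    explicit SimpleVector(std::size_t initial = 1)
        : storage(new T[initial ? initial : 1]), length(0), cap(initial ? initial : 1) {}

    std::size_t size() const { return length; }
    std::size_t capacity() const { return cap; }
    T& operator[](std::size_t i) { return storage[i]; }
    const T& operator[](std::size_t i) const { return storage[i]; }

    void reserve(std::size_t wanted) {
        if (wanted <= cap) return;
        std::size_t new_cap = cap;
        while (new_cap < wanted) new_cap *= 2;    // same doubling policy as the sketch above
        std::unique_ptr<T[]> bigger(new T[new_cap]);
        std::copy(storage.get(), storage.get() + length, bigger.get());
        storage = std::move(bigger);              // the old block is freed here
        cap = new_cap;
    }

    void push_back(T element) {
        reserve(length + 1);
        storage[length++] = std::move(element);
    }
};
```

Pushing ten elements into a default-constructed SimpleVector<int> leaves it with size 10 and capacity 16, since the capacity goes 1, 2, 4, 8, 16. No destructor is needed, and the class is move-only for free because unique_ptr cannot be copied.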

I wrote a vector in C++ a year or so ago. It is an array with a set size (e.g. 16 chars) which is expanded by that amount when needed. That is to say, if the default size is 16 chars and you need to store Hi my name is Bobby, then it will double the size of the array to 32 chars and then store the char array there.
