
std::vector pop_back() implementation

I have just started my Data Structures course and I am implementing a kind of arrayed list (roughly std::vector). I was wondering how pop_back() is implemented so that it has constant complexity? It has to reduce the size by 1 and destroy the last element. Now I saw somewhere that it basically reduces the size by 1 and destroys the last element via some iterator/pointer, but then why couldn't we implement pop_front() the same way and just redirect our pointer from the first element to the next one?

template <typename T>
ArrayList<T>& ArrayList<T>::pop_back()
{
    --size_;
    auto p = new T[size_];
    std::copy(elements_, elements_ + size_, p);  // copies every remaining element: O(n)
    delete[] elements_;   // elements_ being my pointer
    elements_ = p;
    return *this;
}

That's not how pop_back() is typically implemented. While it is an implementation detail, your list/vector/dynamic array would typically keep track of two sizes: the actual size of the list and the capacity. The capacity is the size of the underlying allocated memory. pop_back() then simply decreases the size by 1 and runs the destructor on the last element (do not confuse destruction, i.e. a call to the ~T() method, with the delete operator). It doesn't relocate anything, and the capacity stays the same. The entire operation does not depend on the size of the list (unlike your implementation, which copies every remaining element).

Note that you cannot do the same with pop_front() in an easy way. You would have to track both the beginning and the end of the list plus the underlying memory (and, depending on the approach, either store size and capacity or calculate them at runtime). That requires more memory and potentially more CPU operations. Such a structure also becomes weird: you know the capacity and you know the size, but you don't actually know how many push_back() calls you can do before a resize happens (you only know that the number is bounded by "capacity minus size", but it can be smaller). Unless you add this information to your structure, which again costs either memory or CPU.

Side note: if you are going to go the raw-destructor way, then do not use the delete[] operator at all. The delete operation is pretty much "call destructors + deallocate memory", so if you destruct manually, an additional call to delete[] will lead to undefined behaviour. The exception is if you actually allocate char memory (regardless of T) and use placement new (which also requires some manual size and offset calculations). That is a good way of implementing such a vector, although extra care is required: manual memory tracking is painful.

The reason you're having trouble implementing pop_back() with O(1) complexity is that by using new[] and delete[], you have tied the lifetime of your contained objects to the storage of your objects. What I mean by that is that when you create a raw dynamically allocated array using new T[n], two things happen: 1) storage is allocated and 2) objects are constructed in that storage. Conversely, delete[] will first destroy all objects (call their destructors) and then release the underlying memory back to the system.

C++ does provide ways to deal with storage and lifetimes separately. The main things involved here are raw storage, placement new, pointer casts, and headaches.

In your pop_back() function, it appears you want to be able to destroy just one object in the array without destroying all other objects or releasing their storage. Unfortunately, this is not possible using new[] and delete[]. The way that std::vector and some other containers work around this is by using lower-level language features. Typically, this is done by allocating a contiguous region of raw memory (typed as unsigned char or std::byte, or using helpers like std::aligned_storage), doing a ton of book-keeping, safety checks, and extra work, and then using placement new to construct an object in that raw memory. To access the object, you compute offsets into the raw memory and use a reinterpret_cast to yield a pointer to the object you've placed there. This also requires explicitly calling the destructor of the object. Working at this low level, literally every detail of the object's lifetime is in your hands, and it is very tedious and error-prone. I do not recommend doing it. But it is possible, and it allows std::vector::pop_back() to be implemented in O(1) (and without invalidating iterators to earlier elements).

When you call pop_back() on a standard vector, it doesn't actually release the associated memory. Only the last element is destroyed; the size is reduced but the capacity is not. No other element is copied or moved. This is hard to replicate with custom containers because you can't delete or destroy a single element of an array.

For this reason std::vector<T> doesn't actually use an array of T. It allocates raw uninitialized memory (something like std::aligned_storage) and performs placement new to create elements in that allocation as needed. Placement new is a version of new that does not allocate, but is instead given a pointer to where it should construct the object. This means that the lifetime of an object is not directly tied to its allocation, which uniquely allows you to destroy elements individually, without freeing their underlying memory, by calling their destructors. Internally, it would look something like pop_back() { back().~T(); --size; } .

I was wondering how pop_back() is implemented so that it has constant complexity? It has to reduce the size by 1 and destroy the last element.

Exactly. That's all it has to do. No matter how many elements you have in total.

That's the definition of constant complexity.

Now I saw somewhere that it basically reduces the size by 1 and destroys the last element via some iterator/pointer

That's right. The memory is still allocated, but the object living there undergoes logical destruction.

but then why couldn't we implement pop_front() the same way and just redirect our pointer from the first element to the next one?

You could! That's how std::deque works, which is why std::deque has a pop_front() with constant complexity.

However, you can't do it while maintaining the other cheap operations, because it necessarily causes memory fragmentation after a few calls. Memory is allocated in chunks, and vectors must live in a single contiguous chunk. Imagine that a vector just "ignored" the first element when you pop_front()'d it — now do that five hundred times. You have unused memory just sitting there forever. Not good! And what if you want to add to the front now? Eventually, unless you want to end up using unbounded memory, you'd have to split your storage into distinct chunks, but that breaks the contiguity guarantee.

Other containers are designed to give you what you want, but with trade-offs. In particular, std::deque cannot guarantee you contiguous storage. That's how it prevents leaving loads of unused memory lying around all the time.
