简体   繁体   中英

does std::vector::insert allocate spare capacity?

Code like

std::vector<int> a;
for(size_t i = 0; i < n; ++i)
    a.push_back(0);

is guaranteed to run in linear time in n . This is achieved by allocating some additional spare capacity when reallocating (typically increasing the total capacity by a constant factor).

But what about std::vector::insert(pos, x) ? Ie is it guaranteed that

std::vector<int> a;
for(size_t i = 0; i < n; ++i)
    a.insert(a.end(), 0);

is linear as well? (similar question applies to insert(pos, first, last) of course)

The documentaiton of push_back is clear in guaranteeing "amortized constant" complexity.

But the documentation of insert says the complexity should be "constant plus linear in the distance between pos and end of the container". This can obviously only be true if no reallocation happens, thus my uncertainty.

EDIT: Summary of the answers: When a reallocation occurs, the implementation can either

  1. increase the capacity the bare minimum
  2. or increase the capacity more than the minimum, thus making future insertions faster.

In the case of push_back , option (2) is essentially required to achieve the "amortized constant" runtime. In the case of insert(pos, first, last) , option (2) is required in case of single-pass InputIterator (But in case of multi-pass ForwardIterator, the implementation can and does use a single reallocation with new_capacity = max(2*old_capacity, new_size) ). All other cases are implementation defined.

Testing with GCC 11 and clang 12, the situation seems to be:

  1. reserve and assign increase capacity by the bare minimum, and
  2. push_back and insert and resize increase the capacity by a constant factor

The actual language is

[ vector.overview ] A vector is a sequence container that supports (amortized) constant time insert and erase operations at the end; insert and erase in the middle take linear time

and more specifically

[ vector.modifiers 1 ] Complexity: If reallocation happens, linear in the number of elements of the resulting vector; otherwise, linear in the number of elements inserted plus the distance to the end of the vector.

but we both agree insert is broadly linear.

When reallocation happens, the options are:

  1. reuse the same internal mechanism for growing the vector as in push_back() :

    this is obviously still amortized constant time, so the linear-time element move still dominates

  2. grow the vector by just one element:

    since this is still linear time, it's still within the complexity allowed. However, it is actually more effort for the implementer, and objectively worse

I don't think I've ever seen an implementation that didn't just reuse some internal grow() method for both of these, but it would be technically legal to spend more effort doing something worse.

The exception to this reasoning is the range overload insert(iterator pos, InputIterator begin, InputIterator end) since, for a strict LegacyInputIterator , you can only do one pass.

If you can't pre-calculate the number of elements to grow by, any growth must be amortized constant time, or the overall complexity would be governed by N * distance(begin,end) , which could be O(N²) and thus non-conformant.


tl;dr

  • push_back must allocate extra capacity to remain amortized constant time
  • insert(pos, begin, end) must use amortized constant-time growth for each element of (begin,end] to remain amortized linear time overall
  • insert(pos, value) does not need to allocate extra capacity in order to meet its complexity requirement, but it's probably more effort for the implementer to get a worse result

"Amortized constant" means "constant most of the time". push_back will be O(1) for most of the calls. If reallocation is required, it will be linear, but this should happen relatively rarely.

If reallocation is required for insert , all elements will be moved, that's O(n). If reallocation is not required, all the elements to the right of newly inserted elements are moved, but that's still O(n).


Cppreference doesn't seem to mention reallocation in Complexity section reliably. For example,resize() mentions possible additional complexity in Complexity section, but push_back() andinsert() mention reallocation only in the function description. Might be worth to clean it up one day.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM