
Complexity of stl deque::insert()

I learned the complexity of deque::insert() from the 2003 C++ standard (section 23.2.1.3), which states:

In the worst case, inserting a single element into a deque takes time linear in the minimum of the distance from the insertion point to the beginning of the deque and the distance from the insertion point to the end of the deque.

I have always understood the implementation of the STL deque as a collection of memory chunks. Hence an insertion should only affect the elements in the same memory chunk as the insertion position. My question is: what does the standard mean by "linear in the minimum of the distance from the insertion point to the beginning of the deque and the distance from the insertion point to the end of the deque"?

My understanding is that the C++ standard does not mandate a particular implementation of deque, so the stated complexity is only a general worst-case bound. In the actual implementations shipped with compilers, however, it is linear in the number of elements in a memory chunk, which may vary with the element size.

Another guess is that, since insert() invalidates all iterators, the deque needs to update all iterators, and that is why it is linear.

std::deque is normally (always?) implemented as a collection of chunks of memory, but it won't normally insert a whole new chunk just so that you can insert one new element in the middle of the collection. Instead, it finds whether the insertion point is closer to the beginning or the end, and shuffles the existing elements to make room for the new element in an existing chunk. It only adds a new chunk of memory at the beginning or end of the collection.
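One way to observe this shuffling is to count how many element copies or moves a single insert triggers. The following is a minimal sketch, not a benchmark: `Counted` and `insert_cost` are hypothetical helpers introduced here, and the exact counts depend on the standard library implementation. The standard only bounds them by the distance to the nearer end.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper type: counts every copy/move construction or assignment.
struct Counted {
    static std::size_t ops;
    int value;
    explicit Counted(int v = 0) : value(v) {}
    Counted(const Counted& o) : value(o.value) { ++ops; }
    Counted(Counted&& o) noexcept : value(o.value) { ++ops; }
    Counted& operator=(const Counted& o) { value = o.value; ++ops; return *this; }
    Counted& operator=(Counted&& o) noexcept { value = o.value; ++ops; return *this; }
};
std::size_t Counted::ops = 0;

// Returns how many element copies/moves one insert at `index` costs
// in a deque that already holds `n` elements.
std::size_t insert_cost(std::size_t n, std::size_t index) {
    assert(index <= n);
    std::deque<Counted> d;
    for (std::size_t i = 0; i < n; ++i) d.emplace_back(static_cast<int>(i));
    Counted::ops = 0;  // ignore the cost of building the deque
    d.insert(d.begin() + static_cast<std::ptrdiff_t>(index), Counted(-1));
    return Counted::ops;
}
```

Calling `insert_cost(10000, 10)` and `insert_cost(10000, 5000)` should show a small count near an end and a count on the order of n/2 in the middle, whatever the chunk size is.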

I think you'd be better served by a diagram... let's play with ASCII art!

A deque is usually an array of memory chunks in which all chunks apart from the front and back ones are full. This is necessary because a deque is a RandomAccessContainer, and to get O(1) access to any element, you cannot have an unbounded number of chunks whose sizes must be read:

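size_t size() const { return first.size + (buckets.size() - 2) * SIZE + last.size; }

T& operator[](size_t i) {
  // The first bucket is filled from its back, so its elements sit in its last first.size slots.
  if (i < first.size) { return first[SIZE - first.size + i]; }

  // Skip the first bucket (buckets[0]); every inner bucket holds exactly SIZE elements.
  size_t const correctedIndex = i - first.size;
  return buckets[1 + correctedIndex / SIZE][correctedIndex % SIZE];
}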

Those operations are O(1) because of the multiplication/division!
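The index arithmetic can be tried in isolation. This is a sketch under assumptions: a fixed bucket capacity `SIZE` of 8, a front bucket that is filled from its back end, and a hypothetical free function `locate` that returns the (bucket, slot) pair instead of a reference.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

constexpr std::size_t SIZE = 8;  // assumed bucket capacity

// Maps a logical index i to (bucket number, slot inside that bucket).
// first_size is the number of elements held by bucket 0, which is
// assumed to be filled from its back end.
std::pair<std::size_t, std::size_t> locate(std::size_t first_size, std::size_t i) {
    assert(first_size >= 1 && first_size <= SIZE);
    if (i < first_size) return {0, SIZE - first_size + i};
    const std::size_t corrected = i - first_size;  // index relative to bucket 1
    return {1 + corrected / SIZE, corrected % SIZE};
}
```

With 2 elements in the front bucket, `locate(2, 0)` lands in the second-to-last slot of bucket 0 and `locate(2, 13)` lands in bucket 2. No loop over the buckets is needed, which is what keeps the lookup O(1).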

In my example, I'll suppose that a memory chunk is full when it contains 8 elements. In practice, nobody said the size should be fixed, just that all inner buckets shall have the same size.

 // Deque
 0:       ++
 1: ++++++++
 2: ++++++++
 3: ++++++++
 4: +++++

Now say that we want to insert at index 13. It falls somewhere in the bucket labelled 2. There are several strategies we can think about:

  • extend bucket 2 (only)
  • introduce a new bucket either before or after 2 and shuffle only a few elements

But those two strategies would violate the invariant that all "inner" buckets have the same number of elements.

Therefore we are left with shuffling the elements around, either toward the beginning or the end (whichever is cheaper), in our case:

 // Deque
 0:      +++
 1: ++++++++
 2: +O++++++
 3: ++++++++
 4: +++++

Note how bucket 0 grew.

This shuffle implies that, in the worst case, you'll move half the elements: O(N/2), which is still O(N).

deque has O(1) insert at either the beginning or the end though, because there it's just a matter of adding the element in the right place or (if the bucket is full) creating a new bucket.
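A consequence of never shuffling on insertion at the ends is that existing elements stay where they are in memory. The standard guarantees that such insertions invalidate iterators but not references to elements. A small sketch to check this, where `element_stays_put` is a name made up for this example:

```cpp
#include <cassert>
#include <deque>

// Returns true if a pointer to an existing element still designates the
// same element after `count` pushes at both ends of the deque.
bool element_stays_put(int count) {
    assert(count >= 0);
    std::deque<int> d{1, 2, 3};
    const int* middle = &d[1];  // pointer to the element holding 2
    for (int i = 0; i < count; ++i) {
        d.push_front(-i);  // may allocate a new front bucket, moves nothing
        d.push_back(i);    // may allocate a new back bucket, moves nothing
    }
    // After `count` push_front calls the same element is at index 1 + count.
    return middle == &d[1 + count] && *middle == 2;
}
```

The same check would not hold for an insertion in the middle, where the elements on one side of the insertion point are shifted.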

There are other containers that have better insert/erase behavior at random indices, based on B+ Trees. In an indexed B+ Tree you can, instead of a "key" (or in parallel with it), internally maintain a count of the elements prior to a certain position. There are various techniques to do this efficiently. In the end you get a container with:

  • O(1): empty, size
  • O(log N): at, insert, erase

You can check the blist module in Python, which implements a Python list-like type backed by such a structure.

Your conjecture is ... 99.9% true. It all depends on what the actual implementation is. What the standard specifies is the minimum requirement for both implementors (who cannot claim to be standard-conforming if they don't fit the spec) and users (who must not expect "better performance" when writing implementation-independent code).

The idea behind the spec is a chunk (a == one) of uninitialized memory where elements are allocated around the center... as long as there is space for them. Inserting in the middle means shifting. Inserting at the front or the end means just constructing in place (when no space is left, a reallocation is done). Indexes and iterators cannot be trusted after a modification, since we cannot assume what has been shifted and in which direction.
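To make the single-chunk model concrete, here is a toy sketch. It is not how any real library implements deque: `CenteredBuffer` is a hypothetical class that stores ints in one `std::vector`, keeps the live elements around the middle, and reports how many elements each insert had to shift.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-chunk model: live elements occupy [head_, tail_).
class CenteredBuffer {
public:
    explicit CenteredBuffer(std::size_t capacity = 16)
        : storage_(capacity), head_(capacity / 2), tail_(capacity / 2) {}

    std::size_t size() const { return tail_ - head_; }
    int operator[](std::size_t i) const { return storage_[head_ + i]; }

    // Inserts value before position index and returns the number of
    // existing elements that were shifted to make room.
    std::size_t insert(std::size_t index, int value) {
        assert(index <= size());
        if (head_ == 0 || tail_ == storage_.size()) recenter();
        const std::size_t before = index;
        const std::size_t after = size() - index;
        auto first = storage_.begin() + static_cast<std::ptrdiff_t>(head_);
        if (before <= after) {
            // Closer to the front: slide the prefix one slot toward the front.
            std::move(first, first + static_cast<std::ptrdiff_t>(before), first - 1);
            --head_;
        } else {
            // Closer to the back: slide the suffix one slot toward the back.
            auto last = storage_.begin() + static_cast<std::ptrdiff_t>(tail_);
            std::move_backward(first + static_cast<std::ptrdiff_t>(index), last, last + 1);
            ++tail_;
        }
        storage_[head_ + index] = value;
        return std::min(before, after);
    }

private:
    // Reallocates into a larger chunk and puts the elements back in the middle.
    void recenter() {
        std::vector<int> bigger(std::max<std::size_t>(4, storage_.size() * 2 + 2));
        const std::size_t n = size();
        const std::size_t new_head = (bigger.size() - n) / 2;
        std::copy(storage_.begin() + static_cast<std::ptrdiff_t>(head_),
                  storage_.begin() + static_cast<std::ptrdiff_t>(tail_),
                  bigger.begin() + static_cast<std::ptrdiff_t>(new_head));
        storage_.swap(bigger);
        head_ = new_head;
        tail_ = new_head + n;
    }

    std::vector<int> storage_;
    std::size_t head_;
    std::size_t tail_;
};
```

Inserting at index 0 or at `size()` shifts nothing, while inserting at index i shifts min(i, size() - i) elements, which is the bound quoted from the standard.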

More efficient implementations don't use a single chunk but multiple chunks, to spread out the "shifting" problem and to allocate memory in constant-size blocks from the underlying system (thus limiting reallocation and fragmentation). If you're targeting one of them you can expect better performance; otherwise you had better not assume any structural optimization.

Linear in the number of elements inserted (copy construction). Plus, depending on the particular library implementation, additional time linear in up to the number of elements between position and one of the ends of the deque. Reference...
