
How does array.array insertion work under the hood in Python?

In Python, array.array is a mutable structure.

However, I am not sure how the insertion operation works in the array.array structure.

Since array.array uses contiguous memory, does it create a new memory block and copy all the elements of the array if the new element cannot be placed contiguously? Does it reserve additional unused space just in case, for insertion operations?

Listing [Python 3.Docs]: array - Efficient arrays of numeric values, just in case.

Any decent container that keeps its data in a contiguous memory zone under the hood allocates more memory than is required to hold the current number of elements.
Here is what would happen on every insertion (appending is a particular case) if there were no room for additional elements:

  1. Allocate a new memory area (current size + the size of one element)
  2. Copy the data to the new area
  3. Free the old area
  4. Perform the additional small operations that are done anyway (like updating the size counters, ...)

As seen, when appending (the most common way to add elements), this would mean a lot of work, costing time and resources (CPU power, memory).

Modern containers have a growth policy: every time the memory zone needs to be reallocated (the container is getting full), a number N of elements is added to the existing size to compute the new size; moreover, N gets bigger every time such a reallocation takes place. That is to minimize the (expensive) memory operations.
Of course, at the other end of the spectrum would be the possibility of allocating a huge amount of memory (e.g. 500 MiB) for a container up front, but that wouldn't be feasible, as a lot of memory would just be "sitting" there, reserved in case the owning container might need it.
After all, it's just a matter of compromise.
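To put rough numbers on that compromise, here is a minimal sketch that counts how many element copies a sequence of appends would cost under two policies. It assumes the worst case, where every reallocation copies the whole buffer. The names count_copies, grow_by_one and grow_proportionally are made up for illustration, and the proportional formula is only an example of a mild over-allocation, not a quote from any real container.

```python
def count_copies(n_appends, new_capacity):
    """Count element copies caused by reallocations during n_appends appends.

    new_capacity(needed) is the (hypothetical) growth policy: it receives the
    number of slots needed and returns the capacity to allocate.
    Worst case is assumed: every reallocation copies all existing elements.
    """
    capacity = size = copies = 0
    for _ in range(n_appends):
        if size == capacity:              # buffer is full: reallocate
            capacity = new_capacity(size + 1)
            copies += size                # every existing element is moved
        size += 1
    return copies


def grow_by_one(needed):
    # Naive policy: allocate exactly what is needed, nothing more.
    return needed


def grow_proportionally(needed):
    # Illustrative policy (an assumption): about 1/16 extra plus a constant.
    return needed + (needed >> 4) + 3


naive = count_copies(1000, grow_by_one)
amortized = count_copies(1000, grow_proportionally)
print("grow by one:        ", naive, "copies")
print("grow proportionally:", amortized, "copies")
```

With the naive policy the number of copies grows quadratically with the number of appends, while the proportional policy keeps it roughly linear, at the price of some unused slots.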

You can check [CPPReference]: std::vector as an example (the size and capacity methods).

Back to our problem: array.array is indeed a modern container that does allocate unused space. From [GitHub]: python/cpython - (master) cpython/Modules/arraymodule.c:

    /* This over-allocates proportional to the array size, making room
     * for additional growth.  The over-allocation is mild, but is
     * enough to give linear-time amortized behavior over a long
     * sequence of appends() in the presence of a poorly-performing
     * system realloc().
     * The growth pattern is:  0, 4, 8, 16, 25, 34, 46, 56, 67, 79, ...
     * Note, the pattern starts out the same as for lists but then
     * grows at a smaller rate so that larger arrays only overallocate
     * by about 1/16th -- this is done because arrays are presumed to be more
     * memory critical.
     */
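The over-allocation can also be observed from Python itself. The sketch below assumes CPython, where sys.getsizeof on an array accounts for the allocated buffer and not only the used part. The helper names capacity and observe_growth are hypothetical, and the numbers printed may differ between Python versions (and from the comment above), so treat it as a way to look at the behavior, not as a specification.

```python
import sys
from array import array


def capacity(arr):
    """Estimate the number of allocated slots of an array.array.

    CPython-specific assumption: sys.getsizeof(arr) includes the whole
    allocated buffer, so subtracting the size of an empty array of the
    same type leaves only the buffer.
    """
    empty = array(arr.typecode)
    return (sys.getsizeof(arr) - sys.getsizeof(empty)) // arr.itemsize


def observe_growth(n_appends, typecode="i"):
    """Append n_appends items and record each distinct capacity seen."""
    arr = array(typecode)
    capacities = [capacity(arr)]
    for i in range(n_appends):
        arr.append(i)
        cap = capacity(arr)
        if cap != capacities[-1]:         # a reallocation took place
            capacities.append(cap)
    return arr, capacities


arr, capacities = observe_growth(1000)
print("first capacities:", capacities[:10])
print(len(capacities) - 1, "reallocations for", len(arr), "appends")
```

The capacity stays ahead of the length, and the number of reallocations is much smaller than the number of appends.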

As for the insertion algorithm itself, check the ins1 function:

  • The size is checked (and updated), and the memory is grown if needed
  • The elements following the insertion position are shifted one position towards the end ("right")
  • The new element is placed at the insertion position
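The three steps above can be sketched in pure Python. This is a toy model and not a translation of ins1: the class ToyArray, its methods and its growth formula are all hypothetical, and a Python list pre-filled with None stands in for the raw memory block. The index clamping mimics what list.insert does for out-of-range positions.

```python
class ToyArray:
    """Toy model of a contiguous buffer with spare capacity (illustration only)."""

    def __init__(self):
        self._slots = []   # backing store; len(self._slots) is the capacity
        self._size = 0     # number of slots actually in use

    def _ensure_capacity(self, needed):
        """Step 1: grow the buffer if it cannot hold `needed` elements."""
        if needed <= len(self._slots):
            return
        # Hypothetical mild over-allocation, not CPython's exact formula.
        new_capacity = needed + (needed >> 4) + 3
        new_slots = [None] * new_capacity
        new_slots[:self._size] = self._slots[:self._size]   # copy old data
        self._slots = new_slots

    def insert(self, where, value):
        n = self._size
        # Clamp the index the same way list.insert does.
        if where < 0:
            where += n
            if where < 0:
                where = 0
        if where > n:
            where = n
        self._ensure_capacity(n + 1)
        # Step 2: shift the tail one position to the right, back to front.
        for i in range(n, where, -1):
            self._slots[i] = self._slots[i - 1]
        # Step 3: place the new element.
        self._slots[where] = value
        self._size = n + 1

    def to_list(self):
        return self._slots[:self._size]


toy = ToyArray()
for where, value in [(0, 10), (0, 20), (1, 30)]:
    toy.insert(where, value)
print(toy.to_list())
```

Inserting at the end shifts nothing, which is why appending is the cheap special case, while inserting at the front moves every element.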

As a side note, other Python containers use this technique; check [SO]: Why does list ask about __len__? (@CristiFati's answer) for more details.
