[英]How does the array.array insertion work under the hood in python?
In python, array.array is a mutable structure.在 python 中,array.array 是一个可变结构。
However, I am not sure how the insertion operation works in array.array structure.但是,我不确定插入操作在 array.array 结构中是如何工作的。
Since array.array uses a contiguous memory, does it create a new memory block and copy all the elements of the array if the new element cannot be placed in the contiguous manner?由于 array.array 使用连续的 memory,如果新元素不能以连续的方式放置,它是否会创建一个新的 memory 块并复制数组的所有元素? Does it reserve additional unused space just in case for insertion operations?
它是否保留额外的未使用空间以防插入操作?
Listing [Python 3.Docs]: array - Efficient arrays of numeric values just in case.清单[Python 3.Docs]:数组 - 高效的 arrays 数值以防万一。
Any decent container that under the hood keeps the data in a contiguous memory zone, allocates more memory than required to hold the current number of elements.任何在后台将数据保存在连续 memory 区域中的体面容器,分配的 memory 比保存当前元素数量所需的要多。
What would happen if there would be no room for additional elements when inserting (appending is a particular case) an element:如果在插入(附加是一种特殊情况)元素时没有额外元素的空间会发生什么:
As seen, when appending (which is the most common operation to add elements), there would be a lot of work which takes time and resources ( CPU power, memory).正如所见,在追加(这是添加元素的最常见操作)时,会有很多工作需要时间和资源( CPU能力、内存)。
Modern containers have a growing policy algorithm: every time that the memory zone needs to be reallocated (container is getting full), a number N of elements is added to the existing size to compute the new size, and more: N gets bigger every time such a reallocation takes place .现代容器有一个不断增长的策略算法:每次需要重新分配 memory 区域(容器已满)时,都会向现有大小添加N个元素以计算新大小,并且更多: N每次都变大这样的重新分配发生了。 That is to minimize the (expensive) memory operations.
这是为了最小化(昂贵的)memory 操作。
Of course, at the other end of the interval would be the possibility to allocate a huge amount of memory (eg 500 MiB ) for a container, but that wouldn't be feasible, as a lot of memory would just be "sitting" there as reserved in case the owning container might need it.当然,在间隔的另一端可能会为容器分配大量 memory (例如500 MiB ),但这不可行,因为很多 memory 只是“坐”在那里保留以防拥有的容器可能需要它。
After all, it's just a matter of compromise.毕竟,这只是一个妥协的问题。
You can check [CPPReference]: std::vector as an example ( size and capacity methods).您可以检查[CPRPeference]: std::vector作为示例(大小和容量方法)。
Back to our problem: array.array
is indeed a modern container that does allocate unused space.回到我们的问题:
array.array
确实是一个分配未使用空间的现代容器。 From [GitHub]: python/cpython - (master) cpython/Modules/arraymodule.c :来自[GitHub]:python/cpython - (master) cpython/Modules/arraymodule.c :
/* This over-allocates proportional to the array size, making room * for additional growth. The over-allocation is mild, but is * enough to give linear-time amortized behavior over a long * sequence of appends() in the presence of a poorly-performing * system realloc(). * The growth pattern is: 0, 4, 8, 16, 25, 34, 46, 56, 67, 79, ... * Note, the pattern starts out the same as for lists but then * grows at a smaller rate so that larger arrays only overallocate * by about 1/16th -- this is done because arrays are presumed to be more * memory critical. */
As for the insertion algorithm itself, check the ins1 function:至于插入算法本身,查看ins1 function:
As a side note, other Python containers use this technique, check [SO]: Why does list ask about __len__?作为旁注,其他Python容器使用此技术,请检查[SO]:为什么列表会询问 __len__? (@CristiFati's answer) for more details.
(@CristiFati 的回答)了解更多详情。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.