Python array time complexity?
What's the .append time complexity of array.array and np.array?
I see time complexity for list, collections.deque, set, and dict in python_wiki, but I can't find the time complexity of array.array and np.array. Where can I find them?
Going by the link you provided (also a TL;DR): list objects are internally "represented as an array" (link). Appending is supposed to be O(1), with a note at the bottom saying:
"These operations rely on the "Amortized" part of "Amortized Worst Case". Individual actions may take surprisingly long, depending on the history of the container." (link)
More details
The docs don't go into detail, but if you look at the source code you'll see what's actually going on. Python array objects have an internal buffer that allows them to resize quickly, and they realloc that buffer as they grow or shrink.
array.append uses arraymodule.array_array_append, which calls arraymodule.ins, which in turn calls arraymodule.ins1, the meat and potatoes of the operation. Incidentally, array.extend uses this as well, but it just supplies Py_SIZE(self) as the insertion index.
So if we read the notes in arraymodule.ins1, they start off with:
Bypass realloc() when a previous overallocation is large enough to accommodate the newsize. If the newsize is 16 smaller than the current size, then proceed with the realloc() to shrink the array.
...
This over-allocates proportional to the array size, making room for additional growth. The over-allocation is mild, but is enough to give linear-time amortized behavior over a long sequence of appends() in the presence of a poorly-performing system realloc(). The growth pattern is: 0, 4, 8, 16, 25, 34, 46, 56, 67, 79, ... Note, the pattern starts out the same as for lists but then grows at a smaller rate so that larger arrays only overallocate by about 1/16th -- this is done because arrays are presumed to be more memory critical.
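The over-allocation described in that comment can be observed from Python. The following is a minimal sketch, assuming CPython, where sys.getsizeof on an array.array reflects the allocated buffer rather than just the elements in use. The helper name observed_growth is made up for illustration, and the exact capacities it reports may differ between Python versions.

```python
import sys
from array import array


def observed_growth(n_appends=1000, typecode="i"):
    """Record each distinct buffer capacity seen while appending.

    Assumption: CPython, where sys.getsizeof(arr) includes the
    over-allocated buffer. Other implementations may behave differently.
    """
    arr = array(typecode)
    itemsize = arr.itemsize
    base = sys.getsizeof(arr)  # size of the empty array object
    capacities = [0]
    for i in range(n_appends):
        arr.append(i)
        # Allocated slots, derived from the reported object size.
        capacity = (sys.getsizeof(arr) - base) // itemsize
        if capacity != capacities[-1]:
            capacities.append(capacity)
    return capacities


caps = observed_growth()
print(caps[:10])
print(f"{len(caps) - 1} buffer resizes for 1000 appends")
```

Most appends leave the reported size unchanged, which means they reused spare capacity and did not trigger a realloc.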
To answer your question, it is important to understand the array data structure. Since both array objects are based on C arrays (regular, numpy), they share a lot of the same functionality.
Adding an item to an array is amortized O(1), but an individual append can end up taking O(n) time. If the array's underlying buffer is not yet full, writing the new item into the next free slot in memory is relatively trivial: it is O(1). However, when the buffer is full, the whole array needs to be copied to a new location in memory with the new item added to it. This is a very expensive operation, since an array of size n needs to be copied, which makes that insertion O(n). Note that NumPy arrays have no in-place append: np.append always returns a new array, so each call copies all n elements.
An interesting example from this post:
To make this clearer, consider the case where the factor is 2 and initial array size is 1. Then consider copy costs to grow the array from size 1 to where it's large enough to hold 2^k+1 elements for any k >= 0. This size is 2^(k+1). Total copy costs will include all the copying to become that big in factor-of-2 steps:
1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1 = 2n - 1
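The sum above can be checked with a small simulation. This is only a sketch of a generic growth-factor-2 dynamic array, not the strategy array.array or NumPy actually uses, and the function name doubling_copy_cost is hypothetical.

```python
def doubling_copy_cost(k):
    """Simulate appending 2**k + 1 items to a buffer that starts at
    capacity 1 and doubles whenever it is full.

    Returns (final_capacity, total_elements_copied).
    """
    target = 2 ** k + 1
    capacity, size, copies = 1, 0, 0
    for _ in range(target):
        if size == capacity:
            # Buffer is full: copy every existing element, then double.
            copies += size
            capacity *= 2
        size += 1
    return capacity, copies


for k in range(5):
    capacity, copies = doubling_copy_cost(k)
    print(f"k={k}: capacity={capacity}, copies={copies}")
```

For every k the simulation ends with capacity 2^(k+1) and 2^(k+1) - 1 copied elements, so the copy cost per appended item stays below 2, which is what "amortized O(1)" refers to.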