
Python array time complexity?

What's the time complexity of .append for array.array and np.array?

I see the time complexities for list, collections.deque, set, and dict on the Python wiki, but I can't find those of array.array and np.array. Where can I find them?

Going by the link you provided (also a TL;DR): lists are internally "represented as an array", and append is supposed to be O(1), with a note at the bottom saying:

"These operations rely on the "Amortized" part of "Amortized Worst Case". Individual actions may take surprisingly long, depending on the history of the container."


More details

The docs don't go into detail, but if you look at the source code you can see what's going on. Python arrays have an internal buffer that allows them to be resized quickly, and they realloc as they grow or shrink.
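One way to see the internal buffer at work without reading the C source is to watch the memory size that sys.getsizeof reports while appending. This is a rough sketch that assumes CPython, where the reported size of an array.array tracks its allocated buffer; the helper name capacity_changes is made up for illustration.

```python
import sys
from array import array


def capacity_changes(n):
    """Return the lengths at which the array's reported memory size changed."""
    arr = array("i")
    last = sys.getsizeof(arr)
    changes = []
    for i in range(n):
        arr.append(i)
        current = sys.getsizeof(arr)
        if current != last:
            # The buffer was resized during this append.
            changes.append(len(arr))
            last = current
    return changes


changes = capacity_changes(1000)
print(len(changes), "buffer size changes during 1000 appends")
print("lengths at which the first resizes happened:", changes[:10])
```

If the buffer were resized on every append, the first number printed would be 1000. Seeing far fewer changes suggests that most appends just write into space that was already allocated.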

array.append uses arraymodule.array_array_append, which calls arraymodule.ins, which in turn calls arraymodule.ins1, the meat and potatoes of the operation. Incidentally, array.extend uses this as well, but it just supplies Py_SIZE(self) as the insertion index.

So if we read the comments in arraymodule.ins1, they start off with:

 Bypass realloc() when a previous overallocation is large enough to accommodate the newsize. If the newsize is 16 smaller than the current size, then proceed with the realloc() to shrink the array.


...

 This over-allocates proportional to the array size, making room for additional growth. The over-allocation is mild, but is enough to give linear-time amortized behavior over a long sequence of appends() in the presence of a poorly-performing system realloc(). The growth pattern is: 0, 4, 8, 16, 25, 34, 46, 56, 67, 79, ... Note, the pattern starts out the same as for lists but then grows at a smaller rate so that larger arrays only overallocate by about 1/16th -- this is done because arrays are presumed to be more memory critical.
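To get a feel for why a mild over-allocation of about 1/16th still keeps the cost per append bounded, here is a small simulation. It is only a sketch: mild_growth is a hypothetical growth rule (requested size plus roughly 1/16th plus a small constant), not the exact formula from arraymodule, and count_copies assumes the worst case where every reallocation copies all existing elements.

```python
def count_copies(n, grow):
    """Simulate n appends; return (reallocations, total elements copied)."""
    capacity = size = copied = reallocs = 0
    for _ in range(n):
        if size + 1 > capacity:
            # Out of room: allocate a bigger buffer and, in the worst case,
            # move every existing element into it.
            capacity = grow(size + 1)
            copied += size
            reallocs += 1
        size += 1
    return reallocs, copied


def mild_growth(newsize):
    # Hypothetical rule: about 1/16th extra room plus a small constant.
    return newsize + (newsize >> 4) + 7


n = 100_000
reallocs, copied = count_copies(n, mild_growth)
print(f"{reallocs} reallocations, {copied / n:.2f} elements copied per append")
```

Under these assumptions the number of elements copied per append stays below a constant (under 17 here) no matter how large n gets, which is what amortized O(1) per append means. A real realloc can often extend the block in place, so the actual cost may be lower.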


It is important to understand the array data structure to answer your question. Since both array objects are based on C arrays (regular, numpy), they share a lot of the same functionality.

Adding an item to an array is amortized O(1), but an individual append can end up taking O(n) time. If the array's buffer is not yet full, writing the new item into the next free spot in memory is relatively trivial, so it is O(1). When the buffer is full, however, the whole array needs to be copied to a new place in memory with the new item added to it. This is a very expensive operation, since an array of size n needs to be copied, making that particular insertion O(n).

An interesting example from this post:

To make this clearer, consider the case where the factor is 2 and the initial array size is 1. Then consider the copy costs to grow the array from size 1 to where it's large enough to hold 2^k+1 elements, for any k >= 0. This size is 2^(k+1). The total copy cost includes all the copying needed to become that big in factor-of-2 steps:

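1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1 = 2n - 3

Here n = 2^k + 1 is the number of elements, so the total copy cost is linear in n and the amortized cost per append is O(1).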
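The doubling arithmetic above can be checked with a short simulation. This is a sketch of the textbook model from the example, not of how CPython actually grows its buffers: the function name copies_with_doubling is made up for illustration, and it assumes every growth step copies all existing elements.

```python
def copies_with_doubling(n):
    """Count elements copied while appending n items to a doubling array."""
    capacity = 1  # initial array size of 1, as in the example
    size = 0
    copied = 0
    for _ in range(n):
        if size == capacity:
            # Buffer is full: every existing element moves to the new buffer.
            copied += size
            capacity *= 2
        size += 1
    return copied


for k in range(6):
    elements = 2**k + 1
    total = copies_with_doubling(elements)
    # Compare the simulated cost with the closed form 2^(k+1) - 1.
    print(k, elements, total, 2 ** (k + 1) - 1)
```

In every row the simulated copy count matches 2^(k+1) - 1, which is fewer than two copies per element appended.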
