What is the order of complexity of inserting N elements into a List<T> if I don't specify a capacity?

So let's say I have something like:

List<int> numbers = new List<int>();
int num;
while(int.TryParse(Console.ReadLine(), out num))
{
    numbers.Add(num);
}

and let's say the number of elements added is N. I'm wondering whether the total complexity would be described as O(N) or O(N^2), taking into consideration the fact that insertion is "usually" O(1) but will be O(n) "once in a while", when the list internally needs to be copied into a larger array.

.NET uses a variant of the algorithm that allocates an array twice as large when the list is full, copies the existing elements to it and discards the old one, which basically extends the initial array.

In fact, all languages that provide a List equivalent probably do something like this, since there is no reason not to.
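The doubling strategy described above can be sketched as follows. This is a minimal illustration, not the actual List<T> source: the GrowableIntList type and its starting capacity of 1 are assumptions made to match the walkthrough further down.

```csharp
using System;

// Hypothetical illustration type; not the real List<T> implementation.
public class GrowableIntList
{
    private int[] _items = new int[1]; // assumed starting capacity of 1
    private int _count;

    public int Count => _count;
    public int Capacity => _items.Length;

    public void Add(int value)
    {
        if (_count == _items.Length)
        {
            // Full: allocate an array twice as large and copy the existing elements.
            var bigger = new int[_items.Length * 2];
            Array.Copy(_items, bigger, _count);
            _items = bigger; // the old array is left for the garbage collector
        }
        _items[_count++] = value;
    }

    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= _count)
                throw new ArgumentOutOfRangeException(nameof(index));
            return _items[index];
        }
    }
}

public static class GrowableDemo
{
    public static void Main()
    {
        var list = new GrowableIntList();
        for (int i = 0; i < 5; i++)
        {
            list.Add(i * 10);
            Console.WriteLine($"Count={list.Count}, Capacity={list.Capacity}");
        }
    }
}
```

Only the Add calls that find the array full pay for a copy; every other call is a single array write.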

Inserting n elements is amortized O(n) in total, which makes inserting a single element amortized O(1).

Amortized means that the total number of operations needed to insert n elements is O(n), so the average per insert is O(1), even though one insert may do far more work than another because it has to extend the array.

To see this, consider the classical algorithm I described in the first paragraph. Let's say our capacity is initially 1. We will count how many operations are performed when the array is extended. When inserting the first element, we have:

0

operations, because the array is not extended.

Inserting the second element has to copy the existing elements to the new array and only then perform the insertion. For clarity, we will ignore the constants introduced by memory operations. So this is:

1

operations (copy 1 element, new capacity is 2).

When inserting the third element, we have:

2 

operations (copy 2 elements, new capacity is 4).

When inserting the fourth element, we have 0 operations, because we still have room left.

When inserting the fifth, we have:

4

operations (copy 4 elements, new capacity is 8).

In general, when inserting the (2^k + 1)th element, we will have:

2^k 

operations.

How large can k be? At most log base 2 of n (log n), because by then we will have enough room for our n elements.

So the complexity of all resize operations is given by the sum:

S = 1 + 2 + 4 + ... + 2^k, k = log n
S = (1 - 2^(k+1)) / (1 - 2) // sum of a geometric progression with ratio 2 and k + 1 terms
  = 2^(k+1) - 1
  = 2 * 2^(log n) - 1
  = 2n - 1
  = O(n)

So the total complexity is O(n) for the resizes plus another O(n) for the actual inserts, which we didn't count. But this is still O(n) in total.
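The counting argument can be checked with a small simulation. This is a sketch under the same assumptions as the walkthrough (starting capacity 1, doubling when full); ResizeCost and CountCopies are names made up for the illustration, and the real List<T> may start from a different capacity.

```csharp
using System;

public static class ResizeCost
{
    // Counts the element copies caused by resizes when adding n elements to an
    // array-backed list that starts at capacity 1 and doubles when full.
    public static long CountCopies(int n)
    {
        long copies = 0;
        long capacity = 1;
        for (long count = 0; count < n; count++)
        {
            if (count == capacity)
            {
                copies += count; // every existing element moves to the new array
                capacity *= 2;
            }
        }
        return copies;
    }

    public static void Main()
    {
        foreach (int n in new[] { 5, 1000, 1000000 })
        {
            long copies = CountCopies(n);
            Console.WriteLine($"n={n}: copies={copies}, copies/n={(double)copies / n:F3}");
        }
    }
}
```

For n = 5 this yields 1 + 2 + 4 = 7 copies, matching the steps above, and the ratio copies/n stays below 2 no matter how large n gets.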

That depends entirely on how this instance of List, running on this instance of the Framework, will grow. That is something I cannot predict, as the answer might change depending on OS resources, the Framework version and how full the list already is. It is an implementation detail.
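Since the growth policy is an implementation detail, one way to see what a given runtime does is to watch the Capacity property while adding elements. The sketch below uses only List<T>.Add and List<T>.Capacity; CapacityProbe and ObserveCapacities are hypothetical helper names, and the printed values may differ between runtime versions.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityProbe
{
    // Returns each distinct Capacity value observed while adding n elements
    // to a List<int> created without an explicit capacity.
    public static List<int> ObserveCapacities(int n)
    {
        var numbers = new List<int>();
        var seen = new List<int> { numbers.Capacity };
        for (int i = 0; i < n; i++)
        {
            numbers.Add(i);
            if (numbers.Capacity != seen[seen.Count - 1])
                seen.Add(numbers.Capacity);
        }
        return seen;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", ObserveCapacities(100)));
    }
}
```

If the capacities printed grow geometrically, the amortized O(1) argument from the other answer applies to that runtime.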

Obviously, giving a capacity will improve performance. If you are not certain about the exact number, setting an excess capacity and then calling TrimExcess() after everything has been added might work. Capacity is only there to avoid the overhead of having to grow the list.
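A minimal sketch of that approach, assuming a capacity guess that is deliberately too large; PresizeDemo, Fill and the numbers 1000 and 4096 are made up for the illustration. Note that TrimExcess() is documented to do nothing when the list is already at more than 90 percent of its capacity.

```csharp
using System;
using System.Collections.Generic;

public static class PresizeDemo
{
    // Fills a list created with a (possibly too large) capacity guess,
    // then releases the unused slots.
    public static List<int> Fill(int actualCount, int capacityGuess)
    {
        var numbers = new List<int>(capacityGuess); // one allocation up front
        for (int i = 0; i < actualCount; i++)
            numbers.Add(i); // no resize as long as actualCount <= capacityGuess
        numbers.TrimExcess(); // may be a no-op if the list is already almost full
        return numbers;
    }

    public static void Main()
    {
        var numbers = Fill(actualCount: 1000, capacityGuess: 4096);
        Console.WriteLine($"Count={numbers.Count}, Capacity={numbers.Capacity}");
    }
}
```

Trimming itself reallocates and copies once, so it only pays off when the list will live for a while and the unused space matters.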
