
What is the order of complexity of inserting N elements into a List<T> if I don't specify a capacity?

So let's say I have something like

List<int> numbers = new List<int>();
int num;
while(int.TryParse(Console.ReadLine(), out num))
{
    numbers.Add(num);
}

and let's say the number of elements added is N. I'm wondering whether the total complexity would be described as O(N) or O(N^2), taking into consideration the fact that insertion is "usually" O(1) but will be O(n) "once in a while", when the list internally needs to be copied into a larger array.

.NET uses a variant of the classical algorithm: when the list is full, it allocates an array twice as large, copies the existing elements into it and discards the old one (which the garbage collector reclaims later). This is basically extending the initial array.

In fact, all languages that provide a List equivalent probably do this, since there is no reason not to.
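To make the doubling strategy concrete, here is a minimal sketch of a hand-rolled growable buffer. It is not the actual List<T> source; the names buffer, count and Add are made up for illustration, and the starting capacity of 1 is an assumption chosen to keep the numbers small.

```csharp
using System;

// Hypothetical hand-rolled buffer, for illustration only; not List<T>'s real code.
int[] buffer = new int[1];   // assumed starting capacity of 1
int count = 0;               // number of elements actually stored

void Add(int value)
{
    if (count == buffer.Length)
    {
        // Full: allocate an array twice as large and copy the existing elements over.
        int[] bigger = new int[buffer.Length * 2];
        Array.Copy(buffer, bigger, count);
        buffer = bigger;     // the old array is left to the garbage collector
    }
    buffer[count++] = value; // the O(1) part that every insertion performs
}

for (int i = 1; i <= 5; i++)
{
    Add(i * 10);
}

Console.WriteLine($"count = {count}, capacity = {buffer.Length}");
```

Only the branch that runs when the buffer is full costs O(n); every other call to Add is a single array write.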


Inserting n elements this way takes O(n) time in total, which means each insert is amortized O(1).

Amortized basically means that the total number of operations to insert n elements averages out to O(n), i.e. O(1) per insert, even if one insert might do more operations than another because it has to extend the array.

To see this, consider the classical algorithm I described in the first paragraph. Let's say our capacity is initially 1. We will count how many operations are performed when the array is extended. When inserting the first element, we have:

0

operations, because the array is not extended.

Inserting the second element has to copy the existing elements to the new array and only then perform the insertion. For clarity, we will ignore the constants introduced by memory operations. So this is:

1

operations (copy 1 element, new capacity is 2).

When inserting the third element, we have:

2 

operations (copy 2 elements, new capacity is 4)

When inserting the fourth element, we have 0 operations, because we still have room left.

When inserting the fifth, we have:

4

operations (copy 4 elements, new capacity is 8).

In general, when inserting the (2^k + 1)-th element, we will have:

2^k 

operations.
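The pattern above can be written as a small function. This is only a sketch under the same assumptions as the walkthrough (capacity starts at 1 and doubles when full); CopiesOnInsert is a made-up name, not a framework API.

```csharp
using System;

// Copies needed to insert the i-th element (1-based), assuming the capacity
// starts at 1 and doubles whenever the array is full.
int CopiesOnInsert(int i)
{
    int stored = i - 1; // elements already in the array before this insert
    // The capacity is always a power of two, so the array is full
    // exactly when the number of stored elements is a power of two.
    bool isPowerOfTwo = stored > 0 && (stored & (stored - 1)) == 0;
    return isPowerOfTwo ? stored : 0;
}

for (int i = 1; i <= 9; i++)
{
    Console.WriteLine($"insert #{i}: {CopiesOnInsert(i)} copies");
}
```

For the first five inserts this gives 0, 1, 2, 0 and 4 copies, matching the counts listed above.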

How large can k be? At most log base 2 of n (written log n below), because once the capacity reaches n we have enough room for all n elements and no further resize is needed.

So the complexity of all resize operations is given by the sum:

S = 1 + 2 + 4 + ... + 2^k, k = log n
S = (1 - 2^(k+1)) / (1 - 2) // sum of a geometric progression with ratio 2 and k + 1 terms
  = 2^(k+1) - 1
  = 2 * 2^(log n) - 1
  = 2n - 1
  = O(n)

So the total complexity is O(n) plus another O(n) since we didn't count the actual inserts. But this is still O(n) in total.
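If you would rather check the bound empirically, a simulation along these lines counts the element copies caused by resizing. It is a sketch under the same assumptions (initial capacity 1, doubling on overflow) and does not touch List<T> at all; TotalCopies is a hypothetical helper.

```csharp
using System;

// Total element copies caused by resizing while appending n elements,
// assuming the capacity starts at 1 and doubles whenever the array is full.
long TotalCopies(int n)
{
    int capacity = 1;
    int count = 0;
    long copies = 0;
    for (int i = 0; i < n; i++)
    {
        if (count == capacity)
        {
            copies += count; // every existing element moves to the new array
            capacity *= 2;
        }
        count++;             // the insert itself
    }
    return copies;
}

foreach (int size in new[] { 10, 1_000, 1_000_000 })
{
    long total = TotalCopies(size);
    Console.WriteLine($"n = {size}: {total} copies, {(double)total / size:F2} per insert");
}
```

The copies per insert stay below 2 no matter how large n gets, which is what amortized O(1) means in practice.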

That depends entirely on how this instance of List, running on this instance of the Framework, will grow. That is something I cannot predict, as the answer might change depending on OS resources, the Framework version and how full the list already is. It is an implementation detail.
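One way to see what your own runtime does is to watch the Capacity property while adding elements. This is just a sketch; the printed values depend on the runtime, so none are listed here, and the figure of 100 elements is arbitrary.

```csharp
using System;
using System.Collections.Generic;

var numbers = new List<int>();
int lastCapacity = numbers.Capacity;
Console.WriteLine($"initial capacity: {lastCapacity}");

for (int i = 0; i < 100; i++)
{
    numbers.Add(i);
    if (numbers.Capacity != lastCapacity)
    {
        // The backing array was reallocated during this Add.
        Console.WriteLine($"count {numbers.Count}: capacity {lastCapacity} -> {numbers.Capacity}");
        lastCapacity = numbers.Capacity;
    }
}
```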

Obviously, giving a capacity will improve performance. If you are not quite certain about the exact number, setting an excess capacity and then calling TrimExcess() after everything has been added might work. Capacity is only there to avoid the overhead of having to grow the list.
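As a rough illustration of that approach, the sketch below over-allocates and then trims. The capacity of 1000 and the 100 elements are hypothetical numbers picked for the example, not recommendations.

```csharp
using System;
using System.Collections.Generic;

// 1000 is a hypothetical upper-bound guess for how many items will arrive.
var numbers = new List<int>(1000);

for (int i = 0; i < 100; i++)
{
    numbers.Add(i); // no reallocation: the backing array already has room
}
Console.WriteLine($"before: count = {numbers.Count}, capacity = {numbers.Capacity}");

numbers.TrimExcess(); // shrink the backing array to fit the actual count
Console.WriteLine($"after:  count = {numbers.Count}, capacity = {numbers.Capacity}");
```

Note that, according to the documentation, TrimExcess() does nothing when the list is already at more than 90 percent of its capacity, so it only pays off when the guess was well above the real count.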
