How does IEnumerable<T>.ToArray() work?

Is it a two-pass algorithm? That is, does it iterate the enumerable once to count the elements so that it can allocate the array, and then iterate again to insert them?

Does it loop once and keep resizing the array?

Or does it use an intermediate structure like a List (which probably resizes an array internally)?

It uses an intermediate structure. The actual type involved is Buffer, an internal struct in the framework. In practice, this type holds an array that is copied into a larger one each time it fills up. The array starts with a length of 4 (in .NET 4; this is an implementation detail that might change), so you might end up allocating and copying a lot when calling ToArray.

There is an optimization in place, though. If the source implements ICollection<T>, it uses the Count from that to allocate an array of the correct size from the start.
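One way to observe the ICollection<T> fast path for yourself is to hand ToArray a collection that records which of its members get called. The sketch below is only an illustration: TrackingCollection<T> and FastPathDemo are hypothetical names made up for this example, and which path is taken depends on the runtime version you run it on, so no output is claimed here.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical wrapper (not a framework type) that records which
// members are touched while ToArray runs.
public sealed class TrackingCollection<T> : ICollection<T>
{
    private readonly List<T> _inner;

    public bool CopyToCalled { get; private set; }
    public bool EnumeratorRequested { get; private set; }

    public TrackingCollection(IEnumerable<T> items)
    {
        _inner = new List<T>(items);
    }

    public int Count => _inner.Count;
    public bool IsReadOnly => false;
    public void Add(T item) => _inner.Add(item);
    public void Clear() => _inner.Clear();
    public bool Contains(T item) => _inner.Contains(item);
    public bool Remove(T item) => _inner.Remove(item);

    public void CopyTo(T[] array, int arrayIndex)
    {
        CopyToCalled = true; // the fast path copies in one call
        _inner.CopyTo(array, arrayIndex);
    }

    public IEnumerator<T> GetEnumerator()
    {
        EnumeratorRequested = true; // the slow path walks the items one by one
        return _inner.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class FastPathDemo
{
    public static void Main()
    {
        var source = new TrackingCollection<int>(new[] { 10, 20, 30 });
        int[] result = source.ToArray(); // resolves to Enumerable.ToArray

        Console.WriteLine(result.Length);
        Console.WriteLine("CopyTo called: " + source.CopyToCalled);
        Console.WriteLine("Enumerator requested: " + source.EnumeratorRequested);
    }
}
```

If the optimization described above applies, CopyTo should be reported as called and the enumerator should not be requested.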

First it checks to see if the source is an ICollection<T>, in which case it can allocate an array of exactly Count elements and fill it with a single call to the source's CopyTo() method.

Otherwise, it enumerates the source exactly once. As it enumerates, it stores items into a buffer array. Whenever it hits the end of the buffer array, it creates a new buffer of twice the size and copies in the old elements. Once the enumeration is finished, it returns the buffer (if it is exactly the right size) or copies the items from the buffer into an array of exactly the right size.

Here's pseudo-source code for the operation:

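using System;
using System.Collections.Generic;

public static class EnumerableSketch
{
    // Simplified sketch of the non-ICollection<T> path; not the framework's actual source.
    public static T[] ToArray<T>(this IEnumerable<T> source)
    {
        T[] items = null;
        int count = 0;

        foreach (T item in source)
        {
            if (items == null)
            {
                // First item: start with a small buffer.
                items = new T[4];
            }
            else if (items.Length == count)
            {
                // Buffer is full: double it and copy the old elements across.
                T[] larger = new T[count * 2];
                Array.Copy(items, 0, larger, 0, count);
                items = larger;
            }
            items[count] = item;
            count++;
        }

        // Empty source: the buffer was never allocated.
        if (items == null)
        {
            return new T[0];
        }
        // Buffer happens to be exactly full: return it as is.
        if (items.Length == count)
        {
            return items;
        }
        // Otherwise trim to the exact size.
        T[] result = new T[count];
        Array.Copy(items, 0, result, 0, count);
        return result;
    }
}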

Like this (via .NET Reflector):

public static TSource[] ToArray<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
    {
        throw Error.ArgumentNull("source");
    }
    Buffer<TSource> buffer = new Buffer<TSource>(source);
    return buffer.ToArray();
}

[StructLayout(LayoutKind.Sequential)]
internal struct Buffer<TElement>
{
    internal TElement[] items;
    internal int count;
    internal Buffer(IEnumerable<TElement> source)
    {
        TElement[] array = null;
        int length = 0;
        ICollection<TElement> is2 = source as ICollection<TElement>;
        if (is2 != null)
        {
            length = is2.Count;
            if (length > 0)
            {
                array = new TElement[length];
                is2.CopyTo(array, 0);
            }
        }
        else
        {
            foreach (TElement local in source)
            {
                if (array == null)
                {
                    array = new TElement[4];
                }
                else if (array.Length == length)
                {
                    TElement[] destinationArray = new TElement[length * 2];
                    Array.Copy(array, 0, destinationArray, 0, length);
                    array = destinationArray;
                }
                array[length] = local;
                length++;
            }
        }
        this.items = array;
        this.count = length;
    }

    internal TElement[] ToArray()
    {
        if (this.count == 0)
        {
            return new TElement[0];
        }
        if (this.items.Length == this.count)
        {
            return this.items;
        }
        TElement[] destinationArray = new TElement[this.count];
        Array.Copy(this.items, 0, destinationArray, 0, this.count);
        return destinationArray;
    }
}

First, the items are loaded into an internal struct, Buffer<T>, which also keeps track of the count.

Next, Buffer<T>.ToArray is called. It returns the Buffer<T>'s array directly if it is exactly full; otherwise it does an Array.Copy of that array into a new array of the right size and returns that.

.NET Reflector shows this code if you want to see for yourself.

http://www.red-gate.com/products/reflector/

In general, attempting to iterate an enumerable twice can lead to disaster, as there is no guarantee that the enumerable can be iterated a second time. Therefore, performing a Count, then allocating, then copying is out.
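A small sketch of why a count-then-copy approach is unsafe: the source below is a hypothetical iterator (the names SinglePassDemo and Drain are made up for this example) that consumes a queue as it yields, so the second enumeration sees a different sequence than the first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SinglePassDemo
{
    // Hypothetical source: yields the queue's items while removing them,
    // so the sequence can only be produced once.
    public static IEnumerable<int> Drain(Queue<int> queue)
    {
        while (queue.Count > 0)
        {
            yield return queue.Dequeue();
        }
    }

    public static void Main()
    {
        var queue = new Queue<int>(new[] { 1, 2, 3 });
        IEnumerable<int> source = Drain(queue);

        int counted = source.Count();     // first pass empties the queue
        int[] copied = source.ToArray();  // second pass finds nothing left

        Console.WriteLine(counted);       // prints 3
        Console.WriteLine(copied.Length); // prints 0
    }
}
```

A two-pass ToArray built on such a source would allocate three slots and then fill none of them, which is why a single pass with a growing buffer is the safer design.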

Reflector shows that it uses a type called Buffer that effectively streams the sequence into an array, resizing as needed (doubling on each reallocation, so that the number of reallocations is O(log n)), and then returns an appropriately sized array when it reaches the end.
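To make the O(log n) figure concrete, here is a minimal sketch that counts the reallocations a doubling buffer needs. It assumes the starting capacity of 4 seen in the reflected .NET 4 code, which is an implementation detail; GrowthDemo and CountReallocations are hypothetical names for this example only.

```csharp
public static class GrowthDemo
{
    // Counts how many times a buffer that starts at capacity 4 and doubles
    // when full must be reallocated to hold n items. The initial allocation
    // and the final trim to exact size are not counted.
    public static int CountReallocations(int n)
    {
        long capacity = 4; // long, so that doubling cannot overflow for large n
        int reallocations = 0;
        while (capacity < n)
        {
            capacity *= 2;
            reallocations++;
        }
        return reallocations;
    }
}
```

Under these assumptions, CountReallocations(1000) returns 8 and CountReallocations(1000000) returns 18, so a thousandfold increase in items costs only ten more reallocations.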
