
Can you use List<List<struct>> to get around the 2gb object limit?

I'm running up against the 2 GB object limit in C# (this applies even in 64-bit, for some annoying reason) with a large collection of structs (estimated total size of 4.2 GB).

Now, obviously, using a single List is going to give me one list of roughly 4.2 GB, but would using a list made up of smaller lists, each containing a portion of the structs, allow me to get past this limit?

My reasoning here is that it's only a hard-coded limit in the CLR that stops me from instantiating a 9 GB object on my 64-bit platform, and it's entirely unrelated to system resources. Also, Lists and arrays are reference types, so a List containing lists would actually contain only the references to each list. No single object therefore exceeds the size limit.

Is there any reason why this wouldn't work? I'd try it myself right now, but I don't have a memory profiler on hand to verify.

"Now obviously using List is going to give me a list of size 4.2gb give or take, but would using a list made up of smaller lists, which in turn contain a portion of the structs, allow me to jump this limit?"

Yes, though if you're trying to work around this limit, I'd consider using arrays yourself instead of letting the List<T> class manage the array.

The 2 GB single-object limit in the CLR is exactly that: a limit on a single object instance. When you make an array of a struct (which List<T> uses internally), the entire array is "one object instance" in the CLR. However, with a List<List<T>> or a jagged array, each internal list/array is a separate object, which effectively lets you hold a collection of any total size you wish.
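As a minimal sketch of that idea (not code from the question; the `Sample` struct, the `ChunkedArray` helper, and the chunk size are hypothetical names chosen for illustration), a jagged array can be allocated chunk by chunk so that no single array approaches the limit:

```csharp
using System;

// Hypothetical 16-byte struct standing in for the question's data.
struct Sample
{
    public double X;
    public double Y;
}

static class ChunkedArray
{
    // Allocates totalCount elements as a jagged array whose inner arrays
    // hold at most chunkSize elements each. Every inner array is a separate
    // CLR object, so only the per-chunk size counts against the limit.
    public static T[][] Allocate<T>(long totalCount, int chunkSize)
    {
        if (totalCount < 0) throw new ArgumentOutOfRangeException("totalCount");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");

        long chunkCount = (totalCount + chunkSize - 1) / chunkSize;
        var chunks = new T[chunkCount][];
        for (long i = 0; i < chunkCount; i++)
        {
            // The last chunk may be shorter than chunkSize.
            long remaining = totalCount - i * chunkSize;
            chunks[i] = new T[Math.Min(remaining, chunkSize)];
        }
        return chunks;
    }

    // Maps a flat 64-bit index onto (chunk, offset) and reads the element.
    public static T Get<T>(T[][] chunks, long index, int chunkSize)
    {
        return chunks[index / chunkSize][index % chunkSize];
    }
}
```

For example, `ChunkedArray.Allocate<Sample>(250, 100)` yields three inner arrays of lengths 100, 100, and 50. In real use the chunk size would be chosen so that each chunk stays comfortably below 2 GB.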

The CLR team actually blogged about this, and provided a sample BigArray<T> implementation that acts like a single List<T> but does the "block" management internally for you. This is another option for getting lists larger than 2 GB.

Note that .NET 4.5 will have the option to allow objects larger than 2 GB on x64, but it will be something you have to explicitly opt in to.

The List holds references, which are 4 or 8 bytes each depending on whether you're running in 32-bit or 64-bit mode. Referencing a 2 GB object therefore does not increase the List's own size by 2 GB; it increases it only by the number of bytes needed to reference that object.

This will allow you to reference millions of objects, and each object could be 2 GB. If you have 4 objects in the List and each is 2 GB, then you would have 8 GB worth of objects referenced by the List, but the List object would have used only an extra 4*8=32 bytes.

The number of references a List can hold before the List itself hits the 2 GB limit is 536.87 million on a 32-bit machine and 268.43 million on a 64-bit machine.

536 million references * 2 GB = A LOT OF DATA!
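The figures above follow from dividing the 2 GB limit by the size of one reference. A small sketch of that arithmetic, taking 2 GB as 2^31 bytes and ignoring array header overhead (both simplifying assumptions; the `ReferenceMath` name is hypothetical):

```csharp
using System;

static class ReferenceMath
{
    // 2 GB taken as 2^31 bytes; object header overhead is ignored.
    public const long LimitBytes = 1L << 31;

    // Number of references of the given size that fit under the limit.
    public static long MaxReferences(int referenceSizeBytes)
    {
        return LimitBytes / referenceSizeBytes;
    }
}
```

`ReferenceMath.MaxReferences(4)` gives 536,870,912 and `ReferenceMath.MaxReferences(8)` gives 268,435,456, which match the 536.87 million and 268.43 million quoted for 32-bit and 64-bit references.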

P.S. As Reed pointed out, the above is true for reference types but not for value types. So if you're holding value types, then your workaround is valid. Please see the comment below for more info.

In versions of .NET prior to 4.5, the maximum object size is 2 GB. From 4.5 onwards you can allocate larger objects if gcAllowVeryLargeObjects is enabled. Note that the limit for string is not affected, but "arrays" should cover "lists" too, since lists are backed by arrays.
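For reference, the setting is switched on in the application's configuration file. A minimal fragment looks like the following; it only has an effect in 64-bit processes on .NET Framework 4.5 or later:

```xml
<configuration>
  <runtime>
    <!-- Opt in to arrays larger than 2 GB in total size (64-bit only). -->
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>
```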

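using System;

// A list that stores its items in fixed-size pages (a jagged array), so no
// single array has to hold the entire collection.
class HugeList<T>
{
    private const int PAGE_SIZE = 102400;  // items per page
    private const int ALLOC_STEP = 1024;   // page slots added each time the page table grows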

    private T[][] _rowIndexes;

    private int _currentPage = -1;
    private int _nextItemIndex = PAGE_SIZE;

    private int _pageCount = 0;
    private int _itemCount = 0;

    #region Internals

    private void AddPage()
    {
        if (++_currentPage == _pageCount)
            ExtendPages();

        _rowIndexes[_currentPage] = new T[PAGE_SIZE];
        _nextItemIndex = 0;
    }

    private void ExtendPages()
    {
        if (_rowIndexes == null)
        {
            _rowIndexes = new T[ALLOC_STEP][];
        }
        else
        {
            T[][] rowIndexes = new T[_rowIndexes.Length + ALLOC_STEP][];

            Array.Copy(_rowIndexes, rowIndexes, _rowIndexes.Length);

            _rowIndexes = rowIndexes;
        }

        _pageCount = _rowIndexes.Length;
    }

    #endregion Internals

    #region Public

    public int Count
    {
        get { return _itemCount; }
    }

    public void Add(T item)
    {
        if (_nextItemIndex == PAGE_SIZE)
            AddPage();

        _itemCount++;
        _rowIndexes[_currentPage][_nextItemIndex++] = item;
    }

    public T this[int index]
    {
        get { return _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE]; }
        set { _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE] = value; }
    }

    #endregion Public
}

There's an interesting post on this subject here:

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx

It talks about writing your own 'BigArray' object.
