简体   繁体   中英

Can you use List<List<struct>> to get around the 2gb object limit?

I'm running up against the 2gb object limit in c# (this applies even in 64 bit for some annoying reason) with a large collection of structs (est. size of 4.2 gig in total).

Now obviously using List is going to give me a list of size 4.2gb give or take, but would using a list made up of smaller lists, which in turn contain a portion of the structs, allow me to jump this limit?

My reasoning here is that it's only a hard-coded limit in the CLR that stops me instantiating a 9gig object on my 64bit platform, and it's entirely unrelated to system resources. Also Lists and Arrays are reference types, and so a List containing lists would only actually contain the references to each list. No one object therefore exceeds the size limit.

Is there any reason why this wouldn't work? I'd try this myself right now but I don't have a memory profiler on hand to verify.

Now obviously using List is going to give me a list of size 4.2gb give or take, but would using a list made up of smaller lists, which in turn contain a portion of the structs, allow me to jump this limit?

Yes - though, if you're trying to work around this limit, I'd consider using arrays yourself instead of letting the List<T> class manage the array.

The 2gb single object limit in the CLR is exactly that, a single object instance. When you make an array of a struct (which List<T> uses internally), the entire array is "one object instance" in the CLR. However, by using a List<List<T>> or a jagged array, each internal list/array is a separate object, which allows you to effectively have any size object you wish.

The CLR team actually blogged about this, and provided a sample BigArray<T> implementation that acts like a single List<T> , but does the "block" management internally for you. This is another option for getting >2gb lists.

Note that .NET 4.5 will have the option to provide larger than 2gb objects on x64 , but it will be something you have to explicitly opt in to having.

The List holds references which are 4 or 8 bytes, depending on if you're running in 32-bit or 64-bit mode, therefore if you reference a 2GB object that would not increase the actual List size to 2 GB but it would only increase it by the number of bytes it is necessary to reference that object.

This will allow you to reference millions of objects and each object could be 2GB. If you have 4 objects in the List and each is 2 GB, then you would have 8 GB worth of objects referenced by the List , but the List object would have only used up an extra 4*8=32 bytes.

The number of references you can hold on a 32-bit machine before the List hits the 2GB limit is 536.87 million, on a 64-bit machine it's 268.43 million.

536 million references * 2 GB = A LOT OF DATA!

PS Reed pointed out, the above is true for reference types but not for value types. So if you're holding value types, then your workaround is valid. Please see the comment below for more info.

In versions of .NET prior to 4.5, the maximum object size is 2GB. From 4.5 onwards you can allocate larger objects if gcAllowVeryLargeObjects is enabled. Note that the limit for string is not affected, but "arrays" should cover "lists" too, since lists are backed by arrays.

class HugeList<T>
{
    private const int PAGE_SIZE = 102400;
    private const int ALLOC_STEP = 1024;

    private T[][] _rowIndexes;

    private int _currentPage = -1;
    private int _nextItemIndex = PAGE_SIZE;

    private int _pageCount = 0;
    private int _itemCount = 0;

    #region Internals

    private void AddPage()
    {
        if (++_currentPage == _pageCount)
            ExtendPages();

        _rowIndexes[_currentPage] = new T[PAGE_SIZE];
        _nextItemIndex = 0;
    }

    private void ExtendPages()
    {
        if (_rowIndexes == null)
        {
            _rowIndexes = new T[ALLOC_STEP][];
        }
        else
        {
            T[][] rowIndexes = new T[_rowIndexes.Length + ALLOC_STEP][];

            Array.Copy(_rowIndexes, rowIndexes, _rowIndexes.Length);

            _rowIndexes = rowIndexes;
        }

        _pageCount = _rowIndexes.Length;
    }

    #endregion Internals

    #region Public

    public int Count
    {
        get { return _itemCount; }
    }

    public void Add(T item)
    {
        if (_nextItemIndex == PAGE_SIZE)
            AddPage();

        _itemCount++;
        _rowIndexes[_currentPage][_nextItemIndex++] = item;
    }

    public T this[int index]
    {
        get { return _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE]; }
        set { _rowIndexes[index / PAGE_SIZE][index % PAGE_SIZE] = value; }
    }

    #endregion Public
}

There's an interesting post around this subject here:

http://blogs.msdn.com/b/joshwil/archive/2005/08/10/450202.aspx

Which talks about writing your own 'BigArray' object.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM