
Fastest way to chop array in two pieces

I have an array, say:

var arr1 = new [] { 1, 2, 3, 4, 5, 6 };

Now, when my array size exceeds 5, I want to shrink the current array to 3 items and create a new array that contains the upper 3 values, so that after this action:

arr1 = new [] { 1, 2, 3 };
newArr = new [] { 4, 5, 6 };

What's the fastest way to do this? I guess I'll have to look into the unmanaged corner, but I have no clue where to start.


Some more info:

  • The arrays have to be able to size up without large performance hits
  • The arrays will only contain Int32 values
  • Purpose of the array is to group the numbers in my source array without having to sort the whole list

In short: I want to split the following input array:

int[] arr = new int[] { 1, 3, 4, 29, 31, 33, 35, 36, 37 };

into

arr1 =  1, 3, 4
arr2 =  29, 31, 33, 35, 36, 37

but because the ideal speed is reached with an array size of 3, arr2 should be split into 2 evenly sized arrays.

Note

I know that an array's implementation in memory is quite naive (well, at least it is in C, where you can manipulate the count of items in the array so the array resizes). I also know that there is a memory move function somewhere in the Win32 API. So I guess this would be the fastest:

  1. Change arr1 so it only contains 3 items
  2. Create new array arr2 with size 3
  3. Memmove the bytes that aren't in arr1 anymore into arr2

I'm not sure there's anything better than creating the empty arrays and then using Array.Copy. I'd at least hope that's optimized internally :)

int[] firstChunk = new int[3];
int[] secondChunk = new int[3];
Array.Copy(arr1, 0, firstChunk, 0, 3);
Array.Copy(arr1, 3, secondChunk, 0, 3);

To be honest, for very small arrays the overhead of the method call may be greater than just explicitly assigning the elements - but I assume that in reality you'll be using slightly bigger ones :)
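
For arrays larger than six elements, the same two Array.Copy calls generalize to any length. The following is only a minimal sketch: ArraySplitter and SplitInHalf are hypothetical names, and giving the extra element of an odd-length array to the upper half is an assumption, not something the question specifies.

```csharp
using System;

public static class ArraySplitter
{
    // Hypothetical helper: returns the lower and upper halves as two new arrays.
    // Assumption: for odd lengths the extra element goes to the upper half.
    public static (int[] Lower, int[] Upper) SplitInHalf(int[] source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        int lowerLength = source.Length / 2;
        int upperLength = source.Length - lowerLength;

        var lower = new int[lowerLength];
        var upper = new int[upperLength];

        // Two block copies; the source array itself is left untouched.
        Array.Copy(source, 0, lower, 0, lowerLength);
        Array.Copy(source, lowerLength, upper, 0, upperLength);

        return (lower, upper);
    }
}
```

It could be called as `var (lower, upper) = ArraySplitter.SplitInHalf(arr1);`, after which `arr1` can be reassigned to `lower`.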

You might also consider not actually splitting the array, but instead using ArraySegment to have separate "chunks" of the array. Or perhaps use List<T> to start with... it's hard to know without a bit more context.
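
As a rough illustration of the ArraySegment idea, the sketch below creates two views over one backing array without copying any elements. SegmentDemo, SplitView and Sum are hypothetical names used only for this example; only the ArraySegment<int> constructor and its Array, Offset and Count members come from the framework.

```csharp
using System;

public static class SegmentDemo
{
    // Hypothetical helper: two views over the same backing array, no copying.
    public static (ArraySegment<int> Lower, ArraySegment<int> Upper) SplitView(int[] source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        int lowerLength = source.Length / 2;
        return (new ArraySegment<int>(source, 0, lowerLength),
                new ArraySegment<int>(source, lowerLength, source.Length - lowerLength));
    }

    // Hypothetical helper: reads a chunk through its Array/Offset/Count members.
    public static int Sum(ArraySegment<int> chunk)
    {
        int total = 0;
        for (int i = chunk.Offset; i < chunk.Offset + chunk.Count; i++)
        {
            total += chunk.Array[i];
        }
        return total;
    }
}
```

Because both segments share the original array, a later write such as `arr1[3] = 40` is visible through the upper segment. That is the trade-off against copying: no allocation for the elements, but no isolation either.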

If speed is really critical, then unmanaged code using pointers may well be the fastest approach - but I would definitely check whether you really need to go there before venturing into unsafe code.

Are you looking for something like this?

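using System;

static class Program
{
    // Reads three consecutive ints starting at ptr.
    // The caller must guarantee that all three are inside the buffer.
    static unsafe void DoIt(int* ptr)
    {
        Console.WriteLine(ptr[0]);
        Console.WriteLine(ptr[1]);
        Console.WriteLine(ptr[2]);
    }

    // Requires compiling with unsafe code enabled.
    static unsafe void Main()
    {
        var bytes = new byte[1024];
        new Random().NextBytes(bytes);

        const int chunkBytes = 3 * sizeof(int);

        // Pin the array so the GC cannot move it while raw pointers refer to it.
        fixed (byte* p = bytes)
        {
            // Advance one whole chunk at a time and stop while a full chunk still fits;
            // stepping by sizeof(int) up to bytes.Length would let DoIt read past the end.
            for (int i = 0; i + chunkBytes <= bytes.Length; i += chunkBytes)
            {
                DoIt((int*)(p + i));
            }
        }

        Console.ReadKey();
    }
}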

This avoids creating new arrays (which cannot be resized, not even with unsafe code!) entirely and just passes a pointer into the array to some method which reads the first three integers.

If your array will always contain 6 items how about:

var newarr1 = new[] { oldarr[0], oldarr[1], oldarr[2] };
var newarr2 = new[] { oldarr[3], oldarr[4], oldarr[5] };

Reading from memory is fast.

Arrays in C# cannot be resized in place, so your first array needs a fixed capacity of 5 or 6, depending on your implementation. Each time you need to split, you then have to allocate new fixed-size arrays of 3. Your items will be in their natural order only right after a split, unless you also give each new array a capacity of 5 or 6 and only ever add to the most recent one. With that approach, each new array also carries 2-3 extra, unused elements.

Unless you have a known number of items to go into your array BEFORE compiling the application, you're also going to have to have some form of holder for your dynamically created arrays, meaning you're going to have to have an array of arrays (a jagged array). Since your jagged array is also statically sized, you'll need to be able to dynamically recreate and resize it as each new dynamically created array is instantiated.

I'd say copying the items into the new array is the least of your worries here. You're also looking at some pretty big performance hits as the array sizes grow.


UPDATE: So, if this were absolutely required of me...

public class MyArrayClass
{
    private int[][] _master = new int[10][];
    private int[] _current = new int[3];
    private int _currentCount, _masterCount;

    public void Add(int number)
    {
        _current[_currentCount] = number;
        _currentCount += 1;
        if (_currentCount == _current.Length)
        {
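            // Store the filled chunk directly: _master[_masterCount] is still null here,
            // so Array.Copy into it would throw. _current is replaced below, so no copy is needed.
            _master[_masterCount] = _current;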
            _currentCount = 0;
            _current = new int[3];
            _masterCount += 1;
            if (_masterCount == _master.Length)
            {
                int[][] newMaster = new int[_master.Length + 10][];
                Array.Copy(_master, 0, newMaster, 0, _master.Length);
                _master = newMaster;
            }
        }
    }

    public int[][] GetMyArray()
    {
        return _master;
    }

    public int[] GetMinorArray(int index)
    {
        return _master[index];
    }

    public int GetItem(int MasterIndex, int MinorIndex)
    {
        return _master[MasterIndex][MinorIndex];
    }
}

Note: This probably isn't perfect code, it's a horrible way to implement things, and I would NEVER do this in production code.

The obligatory LINQ solution:

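using System.Linq;

int[] newArr = null; // declared outside the block so it is still usable after the split
if (arr1.Length > 5)
{
    int half = arr1.Length / 2;
    newArr = arr1.Skip(half).ToArray(); // upper part, taken before arr1 is reassigned
    arr1 = arr1.Take(half).ToArray();   // lower part
}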

LINQ is faster than you might think; this will basically be limited by the Framework's ability to spin through an IEnumerable (which on an array is pretty darn fast). This should execute in roughly linear time, and can accept any initial size of arr1.
