
Combine TakeWhile and SkipWhile to partition collection

I would like to partition a collection at the first item that fails a specific condition. I can do that using TakeWhile and SkipWhile, which is pretty easy to understand:

public static bool IsNotSeparator(int value) => value != 3;

var collection = new [] { 1, 2, 3, 4, 5 };
var part1 = collection.TakeWhile(IsNotSeparator);
var part2 = collection.SkipWhile(IsNotSeparator);

But this iterates from the start of the collection twice, and if IsNotSeparator is slow, that might be a performance issue.
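To make the repeated work concrete, here is a small sketch that counts predicate invocations. The class name PredicateCallCounter and the Calls field are made up for illustration; only TakeWhile, SkipWhile and ToList come from LINQ.

```csharp
using System;
using System.Linq;

public static class PredicateCallCounter
{
    // Number of times the predicate has been invoked (illustration only).
    public static int Calls;

    public static bool IsNotSeparator(int value)
    {
        Calls++;
        return value != 3;
    }

    // Materializes both parts and returns how many predicate calls that took.
    public static int CountCallsForBothParts(int[] collection)
    {
        Calls = 0;
        var part1 = collection.TakeWhile(IsNotSeparator).ToList();
        var part2 = collection.SkipWhile(IsNotSeparator).ToList();
        return Calls;
    }
}
```

For { 1, 2, 3, 4, 5 } this returns 6: each operator evaluates the predicate on 1, 2 and 3, and SkipWhile stops calling it once it has returned false.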

A faster way would be to use something like:

var part1 = new List<int>();
var index = 0;
for (var max = collection.Length; index < max; ++index) {
    if (IsNotSeparator(collection[index]))
        part1.Add(collection[index]);
    else
        break;
}
var part2 = collection.Skip(index);

But that's much less readable than the first example.

So my question is: what would be the best way to partition a collection at a specific element?

What I thought of, combining the two approaches above, is:

var collection = new [] { 1, 2, 3, 4, 5 };
var part1 = collection.TakeWhile(IsNotSeparator).ToList();
var part2 = collection.Skip(part1.Count);

This is a quick example of how you would do the more general method (multiple splits, as mentioned in the comments), without LINQ (it's possible to convert it to LINQ, but I am not sure if it will be any more readable, and I am in a slight hurry right now):

public static IEnumerable<IEnumerable<T>> Split<T>(this IList<T> list, Predicate<T> match)
{
    if (list.Count == 0)
        yield break;

    var chunkStart = 0;
    for (int i = 1; i < list.Count; i++)
    {
        if (match(list[i]))
        {
            yield return new ListSegment<T>(list, chunkStart, i - 1);
            chunkStart = i;
        }
    }

    yield return new ListSegment<T>(list, chunkStart, list.Count - 1);
}

The code presumes a class named ListSegment<T> : IEnumerable<T>, which simply iterates over the original list from a start index to an end index, both inclusive. It does no copying, similar to how ArraySegment<T> works (which is unfortunately limited to arrays).
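Since ListSegment<T> is not a framework type, here is a minimal sketch of what such a class could look like. The constructor shape (list, inclusive start index, inclusive end index) is inferred from how Split calls it; the parameter names fromIndex and toIndex are my own, and argument validation is left out for brevity.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical helper: a read-only view over list[fromIndex..toIndex], both inclusive.
public sealed class ListSegment<T> : IEnumerable<T>
{
    private readonly IList<T> _list;
    private readonly int _fromIndex;
    private readonly int _toIndex;

    public ListSegment(IList<T> list, int fromIndex, int toIndex)
    {
        _list = list;
        _fromIndex = fromIndex;
        _toIndex = toIndex;
    }

    // Reads straight from the underlying list; nothing is copied.
    public IEnumerator<T> GetEnumerator()
    {
        for (var i = _fromIndex; i <= _toIndex; i++)
            yield return _list[i];
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

Because the segment is only a view, changes made to the underlying list after the split are visible through it.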

So the code returns one chunk per match, plus a leading chunk when the list does not start with a match; i.e., this code:

var collection = new[] { "A", "B", "-", "C", "D", "-", "E" };
foreach (var chunk in collection.Split(i => i == "-"))
    Console.WriteLine(string.Join(", ", chunk));

would print:

A, B
-, C, D
-, E

How about using the Array.Copy method:

var separator = 3;
var collection = new [] { 1, 2, 3, 4, 5 };

var i = Array.IndexOf(collection,separator);

int[] part1 = new int[i];
int[] part2 = new int[collection.Length - i];
Array.Copy(collection, 0, part1, 0, i ); 
Array.Copy(collection, i, part2, 0, collection.Length - i ); 

Alternatively, to be more efficient, use ArraySegment:

var i = Array.IndexOf(collection,separator);
var part1 = new ArraySegment<int>( collection, 0, i );
var part2 = new ArraySegment<int>( collection, i, collection.Length - i );

ArraySegment is a wrapper around an array that delimits a range of elements in that array. Multiple ArraySegment instances can refer to the same original array and can overlap.
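One caveat with both snippets: Array.IndexOf returns -1 when the separator is not present, and a negative length makes both the array allocation and the ArraySegment constructor throw. A guarded variant might look like the sketch below. The names ArraySplit and SplitAt are hypothetical, and putting everything into the first part when there is no separator is an assumption, chosen to mirror what TakeWhile and SkipWhile would return.

```csharp
using System;

public static class ArraySplit
{
    // Splits at the first occurrence of separator; the separator starts the second part.
    // Assumption: if the separator is absent, the first part is the whole array
    // and the second part is empty.
    public static (ArraySegment<int> Head, ArraySegment<int> Tail) SplitAt(
        int[] collection, int separator)
    {
        var i = Array.IndexOf(collection, separator);
        if (i < 0)
            i = collection.Length;

        return (new ArraySegment<int>(collection, 0, i),
                new ArraySegment<int>(collection, i, collection.Length - i));
    }
}
```

Usage would be along the lines of var (part1, part2) = ArraySplit.SplitAt(collection, 3);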

Edit: combining the approach from the original question with ArraySegment, so as not to evaluate the predicate over the collection twice:

public static bool IsNotSeparator(int value) => value != 3;
var collection = new [] { 1, 2, 3, 4, 5 };

var index = collection.TakeWhile(IsNotSeparator).Count();

var part1 = new ArraySegment<int>( collection, 0, index );
var part2 = new ArraySegment<int>( collection, index, collection.Length - index );
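ArraySegment only works when the source is an array. For an arbitrary IEnumerable<T>, the same idea can be sketched with a single pass that fills two lists. PartitionWhile is a hypothetical extension method, not part of LINQ, and unlike the lazy operators it copies both parts eagerly.

```csharp
using System;
using System.Collections.Generic;

public static class PartitionExtensions
{
    // Hypothetical helper: enumerates source exactly once and calls the predicate
    // only until it first returns false.
    public static (List<T> Head, List<T> Tail) PartitionWhile<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        var head = new List<T>();
        var tail = new List<T>();

        using (var e = source.GetEnumerator())
        {
            var inHead = true;
            while (e.MoveNext())
            {
                // Short-circuit: once inHead is false the predicate is never called again.
                if (inHead && predicate(e.Current))
                {
                    head.Add(e.Current);
                }
                else
                {
                    inHead = false;
                    tail.Add(e.Current);
                }
            }
        }

        return (head, tail);
    }
}
```

With the collection from the question, var (part1, part2) = collection.PartitionWhile(IsNotSeparator); yields 1, 2 in part1 and 3, 4, 5 in part2.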
