How to process "parallel" sequences in LINQ?

Suppose I have 2 enumerations that I know have the same number of elements and each element "corresponds" with the identically placed element in the other enumeration. Is there a way to process these 2 enumerations simultaneously so that I have access to the corresponding elements of each enumeration at the same time?

Using a theoretical LINQ syntax, what I have in mind is something like:

from x in seq1, y in seq2
select new {x.foo, y.bar}

The function you are looking for is called "Zip". It works like a zipper. It'll be in .NET 4.0 iirc. In the meantime you may want to look at the BclExtras library. (Man, I'm a real advocate for this lib, lol).

IEnumerable<Tuple<TSeq1, TSeq2>> tuples = from t in seq1.Zip(seq2)
                                          select t;

If you just want to get it done, you'll have to get both sequences' enumerators and run them "in parallel" using a traditional loop.
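A minimal sketch of that traditional loop, in case it helps. The `ParallelWalk` class and `ForEachPair` name are made up for illustration; only `GetEnumerator`, `MoveNext` and `Current` come from the framework. This version stops quietly as soon as either sequence runs out.

```csharp
using System;
using System.Collections.Generic;

public static class ParallelWalk
{
    // Hypothetical helper (not part of the framework): walks both sequences
    // in lockstep and calls `action` once per pair of corresponding elements.
    public static void ForEachPair<TFirst, TSecond>(
        IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Action<TFirst, TSecond> action)
    {
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            // Advance both enumerators together; stop when either is exhausted.
            while (e1.MoveNext() && e2.MoveNext())
            {
                action(e1.Current, e2.Current);
            }
        }
    }
}
```

Something like `ParallelWalk.ForEachPair(seq1, seq2, (x, y) => Console.WriteLine(x + " " + y));` would then visit the pairs in order.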

Since Neil Williams deleted his answer, I'll go ahead and post a link to an implementation by Jon Skeet.

To paraphrase the relevant portion:

public static IEnumerable<KeyValuePair<TFirst,TSecond>> Zip<TFirst,TSecond>
    (this IEnumerable<TFirst> source, IEnumerable<TSecond> secondSequence)
{
    using (IEnumerator<TSecond> secondIter = secondSequence.GetEnumerator())
    {
        foreach (TFirst first in source)
        {
            if (!secondIter.MoveNext())
            {
                throw new ArgumentException
                    ("First sequence longer than second");
            }
            yield return new KeyValuePair<TFirst, TSecond>(first, secondIter.Current);
        }
        if (secondIter.MoveNext())
        {
            throw new ArgumentException
                ("Second sequence longer than first");
        }
    }        
}

Note that the KeyValuePair<> is my addition, and that I'm normally not a fan of using it this way. Instead, I would define a generic Pair or Tuple type. However, they are not included in the current version of the framework and I didn't want to clutter this sample with extra class definitions.
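For what it's worth, the kind of generic pair type meant here could be as small as the sketch below. The name `Pair` and its members are assumptions for illustration, not a framework type; with it, the iterator would yield `new Pair<TFirst, TSecond>(first, secondIter.Current)` instead of a `KeyValuePair<>`.

```csharp
// Hypothetical immutable pair type; the names are illustrative only.
public sealed class Pair<TFirst, TSecond>
{
    public TFirst First { get; private set; }
    public TSecond Second { get; private set; }

    public Pair(TFirst first, TSecond second)
    {
        First = first;
        Second = second;
    }
}
```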

There's a "Zip" method being added in 4.0 that addresses this (like a zipper, zipping up adjacent elements.) Until then, the most readable (albeit not most performant) way would probably be something like this, unless lazy evaluation is really crucial:

var indexedA = seqA.ToArray();
var indexedB = seqB.ToArray();

for(int i = 0; i < indexedA.Length && i < indexedB.Length; i++)
{
    var thisA = indexedA[i];
    var thisB = indexedB[i];
    // whatever
}

Update:

Eric Lippert recently posted on this: http://blogs.msdn.com/ericlippert/archive/2009/05/07/zip-me-up.aspx

It's especially interesting because he's posted the source for the new extension in C#4:

public static IEnumerable<TResult> Zip<TFirst, TSecond, TResult>
    (this IEnumerable<TFirst> first, 
    IEnumerable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector) 
{
    if (first == null) throw new ArgumentNullException("first");
    if (second == null) throw new ArgumentNullException("second");
    if (resultSelector == null) throw new ArgumentNullException("resultSelector");
    return ZipIterator(first, second, resultSelector);
}

private static IEnumerable<TResult> ZipIterator<TFirst, TSecond, TResult>
    (IEnumerable<TFirst> first, 
    IEnumerable<TSecond> second, 
    Func<TFirst, TSecond, TResult> resultSelector) 
{
    using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
            while (e1.MoveNext() && e2.MoveNext())
                yield return resultSelector(e1.Current, e2.Current);
}
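A quick usage sketch against the .NET 4 `Enumerable.Zip`, with made-up sample data (the `ZipUsage` class and its `Demo` method are only there to make the snippet compile on its own). Note the difference from Jon Skeet's version earlier in the thread: this one stops at the shorter sequence instead of throwing.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ZipUsage
{
    // Illustrative wrapper; the sample arrays are made up.
    public static List<string> Demo()
    {
        int[] numbers = { 1, 2, 3, 4 };
        string[] words = { "one", "two", "three" };

        // The result selector receives the corresponding element of each
        // sequence. Zip ends when either sequence ends, so the trailing 4
        // is silently dropped.
        return numbers.Zip(words, (n, w) => n + "=" + w).ToList();
    }
}
```

`ZipUsage.Demo()` yields three strings: "1=one", "2=two" and "3=three".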

Original answer:

Are you referring to a join?

from x in seq1
join y in seq2
on x.foo equals y.foo
select new {x, y}

There is also PLINQ, which executes LINQ statements in parallel (across multiple threads).


Edit:

Ah - thanks for clarifying the question, though I really don't think my answer deserved a vote down.

It sounds like what you want is something like:

from x in seq1
join y in seq2
on x.Index equals y.Index
select new {x.Foo, y.Bar}

Unfortunately you can't do that with LINQ query syntax: it extends IEnumerable<T>, whose enumerator only really has Current and MoveNext(), so there is no index property.

Obviously you can do this easily in C# with a nested for-loop and an if block, but you can't with Linq I'm afraid.

The only way to mimic this in linq syntax is to artificially add the index:

int counter = 0;
var indexed1 = (
    from x in seq1
    select new { item = x, index = counter++ } ).ToList();
//note the .ToList forces execution, this won't work if lazy

counter = 0;
var indexed2 = (
    from x in seq2
    select new { item = x, index = counter++ } ).ToList();

var result = 
    from x in indexed1 
    join y in indexed2
    on x.index equals y.index
    select new {x.item.Foo, y.item.Bar};
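As a possible alternative to the external counter, the method-syntax overload of `Select` that passes the element's index can attach the index lazily, so no `ToList` is needed. A sketch only: the `IndexedJoin` class, the `PairByIndex` name and the string/int element types are assumptions picked to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexedJoin
{
    // Hypothetical helper: pairs elements of the two sequences by position.
    public static List<string> PairByIndex(IEnumerable<string> seq1, IEnumerable<int> seq2)
    {
        // Select((item, index) => ...) supplies the zero-based position,
        // so no shared mutable counter is involved.
        var indexed1 = seq1.Select((item, index) => new { item, index });
        var indexed2 = seq2.Select((item, index) => new { item, index });

        var result =
            from x in indexed1
            join y in indexed2 on x.index equals y.index
            select x.item + ":" + y.item;

        return result.ToList();
    }
}
```

The join still builds a lookup over the second sequence, so where a Zip method is available it remains the simpler option.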
