I have a few IAsyncEnumerable<string>s that I would like to merge into a single IAsyncEnumerable<string>, which should contain all the elements that are emitted concurrently from those sequences. So I used the Merge operator from the System.Interactive.Async package. The problem is that this operator does not always treat all sequences as equal. In some circumstances it prefers emitting elements from the sequences on the left side of the arguments list, and neglects the sequences on the right side. Here is a minimal example that reproduces this undesirable behavior:
var sequence_A = Enumerable.Range(1, 5).Select(i => $"A{i}").ToAsyncEnumerable();
var sequence_B = Enumerable.Range(1, 5).Select(i => $"B{i}").ToAsyncEnumerable();
var sequence_C = Enumerable.Range(1, 5).Select(i => $"C{i}").ToAsyncEnumerable();
var merged = AsyncEnumerableEx.Merge(sequence_A, sequence_B, sequence_C);
await foreach (var item in merged) Console.WriteLine(item);
This code snippet also has a dependency on the System.Linq.Async package. The sequence_A emits 5 elements prefixed with "A", the sequence_B emits 5 elements prefixed with "B", and the sequence_C emits 5 elements prefixed with "C".
Output (undesirable):
A1
A2
A3
A4
A5
B1
B2
B3
B4
B5
C1
C2
C3
C4
C5
The desirable output should look like this:
A1
B1
C1
A2
B2
C2
A3
B3
C3
A4
B4
C4
A5
B5
C5
In case all sequences have their next element available, the merged sequence should pull one element from each sequence, instead of pulling elements repeatedly from the left-most sequence.
How can I ensure that my sequences are merged with fairness? I am looking for a combination of operators from the official packages that has the desirable behavior, or for a custom Merge operator that does what I want.
Clarification: I am interested in the concurrent Merge functionality, where all source sequences are observed at the same time, and any emission from any of the sequences is propagated to the merged sequence. The concept of fairness applies when more than one sequence can emit an element immediately, in which case their emissions should be interleaved. In the opposite case, when no element is immediately available, the rule is "first to come, first to go".
Update: Here is a more realistic demo that includes latency in the producer sequences and in the consuming enumeration loop. It simulates a situation where consuming the values produced by the left-most sequence takes longer than the time required for producing them.
var sequence_A = Produce("A", 200, 1, 2, 3, 4, 5);
var sequence_B = Produce("B", 150, 1, 2, 3, 4, 5);
var sequence_C = Produce("C", 100, 1, 2, 3, 4, 5);
var merged = AsyncEnumerableEx.Merge(sequence_A, sequence_B, sequence_C);
await foreach (var item in merged)
{
    Console.WriteLine(item);
    await Task.Delay(item.StartsWith("A") ? 300 : 50); // Latency
}

async IAsyncEnumerable<string> Produce(string prefix, int delay, params int[] values)
{
    foreach (var value in values)
    {
        var delayTask = Task.Delay(delay);
        yield return $"{prefix}{value}";
        await delayTask; // Latency
    }
}
The result is an undesirable bias toward the values produced by sequence_A:
A1
A2
A3
A4
A5
B1
B2
C1
B3
C2
B4
C3
C4
B5
C5
Here is the final code. The algorithm has been modified to suit the OP. I have left the original code below.
This uses a greedy algorithm: the first available value is returned, and no attempt is made to merge in strict turns. Each time a task finishes, the next task for the same enumerator goes to the back of the list, ensuring fairness.
The algorithm is as follows:
1. Take a params array of sources.
2. For each source, get the enumerator, call MoveNextAsync, and store the pair in a list.
3. Task.WhenAny on the whole list.
4. Take the first completed Task and find its location in the list.
5. If the result is true, then yield the value and call MoveNextAsync again for the matching enumerator, pushing the resulting tuple to the back of the list.
6. If the result is false, then Dispose the enumerator.
7. A finally block disposes any remaining enumerators.
There are some efficiencies to be had in terms of allocations etc. I've left that as an exercise to the reader.
public static IAsyncEnumerable<T> Interleave<T>(params IAsyncEnumerable<T>[] sources) =>
    Interleave(default, sources);

public static async IAsyncEnumerable<T> Interleave<T>([EnumeratorCancellation] CancellationToken token, IAsyncEnumerable<T>[] sources)
{
    if (sources.Length == 0)
        yield break;
    var enumerators = new List<(IAsyncEnumerator<T> e, Task<bool> t)>(sources.Length);
    try
    {
        for (var i = 0; i < sources.Length; i++)
        {
            var e = sources[i].GetAsyncEnumerator(token);
            enumerators.Add((e, e.MoveNextAsync().AsTask()));
        }
        do
        {
            var taskResult = await Task.WhenAny(enumerators.Select(tuple => tuple.t));
            var ind = enumerators.FindIndex(tuple => tuple.t == taskResult);
            var tuple = enumerators[ind];
            enumerators.RemoveAt(ind);
            if (taskResult.Result)
            {
                yield return tuple.e.Current;
                enumerators.Add((tuple.e, tuple.e.MoveNextAsync().AsTask()));
            }
            else
            {
                try
                {
                    await tuple.e.DisposeAsync();
                }
                catch
                {
                    // Ignore disposal errors.
                }
            }
        } while (enumerators.Count > 0);
    }
    finally
    {
        for (var i = 0; i < enumerators.Count; i++)
        {
            try
            {
                await enumerators[i].e.DisposeAsync();
            }
            catch
            {
                // Ignore disposal errors.
            }
        }
    }
}
EDIT The below isn't quite what the OP wanted, as the OP wants any result to be returned, whichever completes first. I'll leave this here because it's a good demonstration of the algorithm.
Here is a full implementation of the async Interleave or Merge algorithm, known more commonly in SQL terms as a merge concatenation.
The algorithm is as follows:
1. Take a params array of sources.
2. Get the enumerators, then loop: call MoveNextAsync on the current one.
3. If the result is true, then yield the value and increment the loop counter. If it rolls over, go back to the beginning.
4. If the result is false, then Dispose the enumerator and remove it from the list. Do not increment the counter.
5. A finally block disposes any remaining enumerators.

public static IAsyncEnumerable<T> Interleave<T>(params IAsyncEnumerable<T>[] sources) =>
    Interleave(default, sources);

public static async IAsyncEnumerable<T> Interleave<T>([EnumeratorCancellation] CancellationToken token, IAsyncEnumerable<T>[] sources)
{
    if (sources.Length == 0)
        yield break;
    var enumerators = new List<IAsyncEnumerator<T>>(sources.Length);
    try
    {
        for (var i = 0; i < sources.Length; i++)
            enumerators.Add(sources[i].GetAsyncEnumerator(token));
        var j = 0;
        do
        {
            if (await enumerators[j].MoveNextAsync())
            {
                yield return enumerators[j].Current;
                j++;
                if (j >= enumerators.Count)
                    j = 0;
            }
            else
            {
                try
                {
                    await enumerators[j].DisposeAsync();
                }
                catch
                {
                    // Ignore disposal errors.
                }
                enumerators.RemoveAt(j);
                if (j >= enumerators.Count)
                    j = 0; // Wrap around after removing the last enumerator.
            }
        } while (enumerators.Count > 0);
    }
    finally
    {
        for (var i = 0; i < enumerators.Count; i++)
        {
            try
            {
                await enumerators[i].DisposeAsync();
            }
            catch
            {
                // Ignore disposal errors.
            }
        }
    }
}
This can obviously be significantly simplified if you only have a fixed number of source enumerables.
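For instance, a hypothetical two-source variant (a sketch of the simplification, not code from the answer above) could strictly alternate between the two enumerators, and keep draining whichever one remains once the other completes:

```csharp
// Requires: using System.Collections.Generic;
// Hypothetical two-source simplification: no list, no index bookkeeping.
public static async IAsyncEnumerable<T> Interleave<T>(
    IAsyncEnumerable<T> first, IAsyncEnumerable<T> second)
{
    await using var e1 = first.GetAsyncEnumerator();
    await using var e2 = second.GetAsyncEnumerator();
    bool has1 = true, has2 = true;
    while (has1 || has2)
    {
        // Each flag turns false permanently once its enumerator is exhausted.
        if (has1 && (has1 = await e1.MoveNextAsync())) yield return e1.Current;
        if (has2 && (has2 = await e2.MoveNextAsync())) yield return e2.Current;
    }
}
```

The await using declarations also make the explicit finally-based disposal unnecessary in this fixed-arity case.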
The example is a bit contrived as all results are available immediately. If even a small delay is added, the results are mixed:
var sequence_A = AsyncEnumerable.Range(1, 5)
    .SelectAwait(async i => { await Task.Delay(i); return $"A{i}"; });
var sequence_B = AsyncEnumerable.Range(1, 5)
    .SelectAwait(async i => { await Task.Delay(i); return $"B{i}"; });
var sequence_C = AsyncEnumerable.Range(1, 5)
    .SelectAwait(async i => { await Task.Delay(i); return $"C{i}"; });
var sequence_D = AsyncEnumerable.Range(1, 5)
    .SelectAwait(async i => { await Task.Delay(i); return $"D{i}"; });
var seq = Interleave(sequence_A, sequence_B, sequence_C, sequence_D);
await foreach (var item in seq) Console.WriteLine(item);
This produces different, mixed results each time:
B1
A1
C1
D1
D2
A2
B2
C2
D3
A3
B3
C3
C4
A4
B4
D4
D5
A5
B5
C5
The method's comments explain it was reimplemented to be cheaper and fairer:
//
// This new implementation of Merge differs from the original one in a few ways:
//
// - It's cheaper because:
// - no conversion from ValueTask<bool> to Task<bool> takes place using AsTask,
// - we don't instantiate Task.WhenAny tasks for each iteration.
// - It's fairer because:
// - the MoveNextAsync tasks are awaited concurrently, but completions are queued,
// instead of awaiting a new WhenAny task where "left" sources have preferential
// treatment over "right" sources.
//
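The queued-completions idea can be approximated in user code with System.Threading.Channels: every source writes into one shared channel, so values surface in arrival order rather than in argument-list order. The following is only a rough sketch of that technique under minimal error handling, not the library's actual code; MergeQueued is a hypothetical name:

```csharp
// Requires: using System.Collections.Generic; using System.Linq;
//           using System.Threading.Channels; using System.Threading.Tasks;
public static async IAsyncEnumerable<T> MergeQueued<T>(
    params IAsyncEnumerable<T>[] sources)
{
    // A small buffer keeps the producers roughly in lockstep.
    var channel = Channel.CreateBounded<T>(1);
    var pumps = sources.Select(async source =>
    {
        await foreach (var item in source)
            await channel.Writer.WriteAsync(item);
    }).ToArray();
    // Complete the channel when all pumps finish, propagating any failure.
    _ = Task.WhenAll(pumps).ContinueWith(t => channel.Writer.TryComplete(t.Exception));
    await foreach (var item in channel.Reader.ReadAllAsync())
        yield return item;
}
```

Note that, unlike a pull-based merge, this keeps one background reader task per source alive for the duration of the enumeration.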
I am posting one more answer, because I noticed some other minor defects in the current¹ AsyncEnumerableEx.Merge implementation that I would like to fix:
1. It is destructive: in case of an error, elements that have already been consumed from the source sequences may be discarded, making the AsyncEnumerableEx.Merge implementation not very suitable for producer-consumer scenarios, where processing all the consumed elements is mandatory.
2. When the enumeration is abandoned prematurely, the pending MoveNextAsync operations of the source enumerators are not canceled.
3. It is generally advised not to throw errors on Dispose or on DisposeAsync. Nevertheless the AsyncEnumerableEx.Merge implementation propagates normal operational errors (errors thrown by the MoveNextAsync) from the finally block.
The MergeEx implementation below is an attempt to fix these problems. It is a concurrent and non-destructive implementation, which propagates all the consumed values. All the errors that are caught are preserved, and are propagated in an AggregateException.
/// <summary>
/// Merges elements from all source sequences, into a single interleaved sequence.
/// </summary>
public static IAsyncEnumerable<TSource> MergeEx<TSource>(
    params IAsyncEnumerable<TSource>[] sources)
{
    ArgumentNullException.ThrowIfNull(sources);
    sources = sources.ToArray(); // Defensive copy.
    if (sources.Any(s => s is null)) throw new ArgumentException(
        $"The {nameof(sources)} argument included a null value.", nameof(sources));
    return Implementation();

    async IAsyncEnumerable<TSource> Implementation(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        if (sources.Length == 0) yield break;
        cancellationToken.ThrowIfCancellationRequested();
        using var linkedCts = CancellationTokenSource
            .CreateLinkedTokenSource(cancellationToken);
        List<(IAsyncEnumerator<TSource>, Task<bool> MoveNext)> state = new();
        List<Exception> errors = new();
        try
        {
            // Create enumerators and initial MoveNextAsync tasks.
            foreach (var source in sources)
            {
                IAsyncEnumerator<TSource> enumerator;
                Task<bool> moveNext;
                try { enumerator = source.GetAsyncEnumerator(linkedCts.Token); }
                catch (Exception ex) { errors.Add(ex); break; }
                try { moveNext = enumerator.MoveNextAsync().AsTask(); }
                catch (Exception ex) { moveNext = Task.FromException<bool>(ex); }
                state.Add((enumerator, moveNext));
            }
            bool cancellationOccurred = false;
            // Loop until all enumerators are completed.
            while (state.Count > 0)
            {
                int completedIndex = -1;
                for (int i = 0; i < state.Count; i++)
                {
                    var status = state[i].MoveNext.Status;
                    if (status == TaskStatus.Faulted || status == TaskStatus.Canceled)
                    {
                        // Handle errors with priority.
                        completedIndex = i;
                        break;
                    }
                    else if (status == TaskStatus.RanToCompletion)
                    {
                        // Handle completions in order.
                        if (completedIndex == -1) completedIndex = i;
                        continue;
                    }
                }
                if (completedIndex == -1)
                {
                    // All MoveNextAsync tasks are currently in-flight.
                    await Task.WhenAny(state.Select(e => e.MoveNext))
                        .ConfigureAwait(false);
                    continue;
                }
                var (enumerator, moveNext) = state[completedIndex];
                Debug.Assert(moveNext.IsCompleted);
                (TSource Value, bool HasValue) item;
                try
                {
                    bool moved = await moveNext.ConfigureAwait(false);
                    item = moved ? (enumerator.Current, true) : default;
                }
                catch (OperationCanceledException)
                    when (linkedCts.IsCancellationRequested)
                {
                    // Cancellation from the linked token source is not an error.
                    item = default; cancellationOccurred = true;
                }
                catch (Exception ex)
                {
                    errors.Add(ex); linkedCts.Cancel();
                    item = default;
                }
                if (item.HasValue)
                    yield return item.Value;
                if (item.HasValue && errors.Count == 0)
                {
                    try { moveNext = enumerator.MoveNextAsync().AsTask(); }
                    catch (Exception ex) { moveNext = Task.FromException<bool>(ex); }
                    // Deprioritize the selected enumerator.
                    state.RemoveAt(completedIndex);
                    state.Add((enumerator, moveNext));
                }
                else
                {
                    // The selected enumerator has completed or an error has occurred.
                    state.RemoveAt(completedIndex);
                    try { await enumerator.DisposeAsync().ConfigureAwait(false); }
                    catch (Exception ex) { errors.Add(ex); linkedCts.Cancel(); }
                }
            }
            if (errors.Count > 0)
                throw new AggregateException(errors);
            // Propagate cancellation only if it occurred during the loop.
            if (cancellationOccurred)
                cancellationToken.ThrowIfCancellationRequested();
        }
        finally
        {
            // The finally runs when an enumerator created by this method is disposed.
            // Cancel any active enumerators, for more responsive completion.
            // Prevent fire-and-forget, otherwise the DisposeAsync() might throw.
            // Suppress MoveNextAsync errors, but propagate DisposeAsync errors.
            errors.Clear();
            try { linkedCts.Cancel(); }
            catch (Exception ex) { errors.Add(ex); }
            foreach (var (enumerator, moveNext) in state)
            {
                if (!moveNext.IsCompleted)
                {
                    try { await moveNext.ConfigureAwait(false); } catch { }
                }
                try { await enumerator.DisposeAsync().ConfigureAwait(false); }
                catch (Exception ex) { errors.Add(ex); }
            }
            if (errors.Count > 0)
                throw new AggregateException(errors);
        }
    }
}
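A usage sketch (the sequence names are borrowed from the question's demo, and the timeout value is arbitrary): an external token supplied through WithCancellation flows into the [EnumeratorCancellation] parameter of the implementation:

```csharp
// Cancel the whole merged enumeration after 5 seconds.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
var merged = MergeEx(sequence_A, sequence_B, sequence_C);
await foreach (var item in merged.WithCancellation(cts.Token))
    Console.WriteLine(item);
```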
¹ System.Interactive.Async version 6.0.1