
How to run a Parallel.ForEachAsync loop with NoBuffering?

The synchronous Parallel.ForEach method has many overloads, and some of them allow configuring the parallel loop with the EnumerablePartitionerOptions.NoBuffering option:

Create a partitioner that takes items from the source enumerable one at a time and does not use intermediate storage that can be accessed more efficiently by multiple threads. This option provides support for low latency (items will be processed as soon as they are available from the source) and provides partial support for dependencies between items (a thread cannot deadlock waiting for an item that the thread itself is responsible for processing).
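For reference, this is how the synchronous loop can be configured with that option (a minimal sketch; the integer source, the trivial summing body, and the NoBufferingDemo name are just illustrative):

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class NoBufferingDemo
{
    // Sums the items of a source with Parallel.ForEach, wrapping the source
    // in a partitioner that hands out one item at a time and keeps nothing
    // in intermediate storage.
    public static int Sum(IEnumerable<int> source, int degreeOfParallelism)
    {
        Partitioner<int> partitioner = Partitioner.Create(
            source, EnumerablePartitionerOptions.NoBuffering);
        ParallelOptions options = new()
        {
            MaxDegreeOfParallelism = degreeOfParallelism
        };
        int sum = 0;
        Parallel.ForEach(partitioner, options,
            item => Interlocked.Add(ref sum, item));
        return sum;
    }
}
```

The Partitioner.Create(IEnumerable&lt;TSource&gt;, EnumerablePartitionerOptions) overload is the configuration point here; Parallel.ForEachAsync exposes no analogous parameter.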

There is no such option or overload for the asynchronous Parallel.ForEachAsync. And this is a problem for me, because I want to use this method with a Channel<T> as source, in a producer-consumer scenario, as the consumer. In my scenario it is important that the consumer bites exactly what it can chew, and no more. I don't want the consumer pulling aggressively from the Channel<T> and then putting the pulled elements in its personal hidden buffer. I want the Channel<T> to be the only queue in the system, so that I can monitor it and have accurate statistics about the elements that are waiting to be processed/consumed.

Until recently I was under the impression that the Parallel.ForEachAsync method does not buffer, by design. But in order to be sure, I asked Microsoft on GitHub for a clarification. I got feedback very quickly, but not what I expected:

It's an implementation detail. With Parallel.ForEach, the buffering is done to handle body delegates that might be really fast, and thus it's attempting to minimize / amortize the cost of taking the lock to access the shared enumerator. With ForEachAsync, it's expected that the body delegates will be at least a bit meatier, and thus it doesn't attempt to do such amortization. At least not today.

Being dependent on an implementation detail is highly undesirable. So I have to rethink my approach.

My question is: Is it possible to configure the Parallel.ForEachAsync API so that it has guaranteed NoBuffering behavior? If yes, how?

Clarification: I am not asking how to reinvent Parallel.ForEachAsync from scratch. I am asking for some kind of thin wrapper around the existing Parallel.ForEachAsync API, one that "injects" the desired NoBuffering behavior. Something like this:

public static Task ForEachAsync_NoBuffering<TSource>(
    IAsyncEnumerable<TSource> source,
    ParallelOptions parallelOptions,
    Func<TSource, CancellationToken, ValueTask> body)
{
    // Some magic here
    return Parallel.ForEachAsync(source, parallelOptions, body);
}

The wrapper should behave exactly the same as the Parallel.ForEachAsync method on .NET 6.


Update: Here is the basic layout of my scenario:

class Processor
{
    private readonly Channel<Item> _channel;
    private readonly Task _consumer;

    public Processor()
    {
        _channel = Channel.CreateUnbounded<Item>();
        _consumer = StartConsumer();
    }

    public int PendingItemsCount => _channel.Reader.Count;
    public Task Completion => _consumer;

    public void QueueItem(Item item) => _channel.Writer.TryWrite(item);

    private async Task StartConsumer()
    {
        ParallelOptions options = new() { MaxDegreeOfParallelism = 2 };
        await Parallel.ForEachAsync(_channel.Reader.ReadAllAsync(), options, async (item, _) =>
        {
            // Call async API
            // Persist the response of the API in an RDBMS
        });
    }
}

There might be other tools available that could also be used for this purpose, but I prefer to use the smoking hot (.NET 6) Parallel.ForEachAsync API. This is the focus of this question.
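To make the layout concrete, here is a self-contained sketch of the same consumer shape (the int payloads and the trivial summing body stand in for the Item type and the real API/RDBMS work; ChannelConsumerDemo is an illustrative name):

```csharp
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelConsumerDemo
{
    // Produces 1..10 into an unbounded channel, then consumes the channel
    // with Parallel.ForEachAsync via ChannelReader.ReadAllAsync.
    public static async Task<int> RunAsync()
    {
        Channel<int> channel = Channel.CreateUnbounded<int>();
        for (int i = 1; i <= 10; i++) channel.Writer.TryWrite(i);
        channel.Writer.Complete();

        ParallelOptions options = new() { MaxDegreeOfParallelism = 2 };
        int sum = 0;
        await Parallel.ForEachAsync(channel.Reader.ReadAllAsync(), options,
            async (item, ct) =>
            {
                await Task.Yield(); // Stand-in for the real asynchronous work.
                Interlocked.Add(ref sum, item);
            });
        return sum;
    }
}
```

Whether the loop pulls items eagerly from the reader is exactly the implementation detail discussed above; nothing in this shape prevents it.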

I think that I've found a way to implement the ForEachAsync_NoBuffering method. The idea is to feed the underlying Parallel.ForEachAsync loop with a fake infinite IEnumerable<TSource>, and do the actual enumeration of the IAsyncEnumerable<TSource> source inside the body:

/// <summary>
/// Executes a for-each operation on an asynchronous sequence, in which iterations
/// may run in parallel. Items are taken from the source sequence one at a time,
/// and no intermediate storage is used.
/// </summary>
public static Task ForEachAsync_NoBuffering<TSource>(
    IAsyncEnumerable<TSource> source,
    ParallelOptions parallelOptions,
    Func<TSource, CancellationToken, ValueTask> body)
{
    ArgumentNullException.ThrowIfNull(source);
    ArgumentNullException.ThrowIfNull(parallelOptions);
    ArgumentNullException.ThrowIfNull(body);
    bool completed = false;
    IEnumerable<TSource> Infinite()
    {
        while (!Volatile.Read(ref completed)) yield return default;
    }
    SemaphoreSlim semaphore = new(1, 1);
    IAsyncEnumerator<TSource> enumerator = source.GetAsyncEnumerator();
    return Parallel.ForEachAsync(Infinite(), parallelOptions, async (_, ct) =>
    {
        // Take the next item in the sequence, after acquiring an exclusive lock.
        TSource item;
        await semaphore.WaitAsync(); // Continue on captured context.
        try
        {
            if (completed) return;
            if (!(await enumerator.MoveNextAsync())) // Continue on captured context.
            {
                completed = true; return;
            }
            item = enumerator.Current;
        }
        finally { semaphore.Release(); }
        // Invoke the body with the item that was taken.
        await body(item, ct).ConfigureAwait(false);
    }).ContinueWith(async t =>
    {
        // Dispose the enumerator.
        await enumerator.DisposeAsync().ConfigureAwait(false);
        semaphore.Dispose();
        return t;
    }, default, TaskContinuationOptions.DenyChildAttach |
        TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default)
        .Unwrap().Unwrap();
}

The final ContinueWith is needed in order to dispose the enumerator, as well as the SemaphoreSlim that is used for serializing the operations on the enumerator. The advantage of the ContinueWith over a simpler await is that it propagates all the exceptions of the parallel loop, not just the first one.
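For completeness, the guarantee can be exercised end to end. The sketch below repeats the wrapper (condensed, with the argument validation omitted) so that it compiles on its own, then blocks both workers mid-body and observes that with MaxDegreeOfParallelism = 2 exactly two items have left the channel; the int payloads and the NoBufferingLoop name are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class NoBufferingLoop
{
    // Condensed copy of the ForEachAsync_NoBuffering wrapper shown above.
    public static Task ForEachAsync_NoBuffering<TSource>(
        IAsyncEnumerable<TSource> source,
        ParallelOptions parallelOptions,
        Func<TSource, CancellationToken, ValueTask> body)
    {
        bool completed = false;
        IEnumerable<TSource> Infinite()
        {
            while (!Volatile.Read(ref completed)) yield return default;
        }
        SemaphoreSlim semaphore = new(1, 1);
        IAsyncEnumerator<TSource> enumerator = source.GetAsyncEnumerator();
        return Parallel.ForEachAsync(Infinite(), parallelOptions, async (_, ct) =>
        {
            TSource item;
            await semaphore.WaitAsync();
            try
            {
                if (completed) return;
                if (!await enumerator.MoveNextAsync()) { completed = true; return; }
                item = enumerator.Current;
            }
            finally { semaphore.Release(); }
            await body(item, ct).ConfigureAwait(false);
        }).ContinueWith(async t =>
        {
            await enumerator.DisposeAsync().ConfigureAwait(false);
            semaphore.Dispose();
            return t;
        }, default, TaskContinuationOptions.DenyChildAttach |
            TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default)
            .Unwrap().Unwrap();
    }

    // Blocks both workers mid-body and observes how many items are still
    // inside the channel: with 10 items and 2 workers, exactly 8 remain,
    // because no item is ever pulled into a hidden buffer.
    public static async Task<(int Pending, int Sum)> DemoAsync()
    {
        Channel<int> channel = Channel.CreateUnbounded<int>();
        for (int i = 1; i <= 10; i++) channel.Writer.TryWrite(i);
        channel.Writer.Complete();

        SemaphoreSlim started = new(0);
        TaskCompletionSource gate = new();
        int sum = 0;
        Task loop = ForEachAsync_NoBuffering(
            channel.Reader.ReadAllAsync(),
            new ParallelOptions { MaxDegreeOfParallelism = 2 },
            async (item, ct) =>
            {
                started.Release();
                await gate.Task; // Hold the item until the observation below.
                Interlocked.Add(ref sum, item);
            });
        await started.WaitAsync();
        await started.WaitAsync(); // Both workers now hold one item each.
        int pending = channel.Reader.Count; // 8: nothing was buffered.
        gate.TrySetResult();
        await loop;
        return (pending, sum);
    }
}
```

The observation is deterministic because the channel is only read inside the body delegate, under the semaphore, and at most MaxDegreeOfParallelism delegates run concurrently.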
