C# AsyncEnumerable running/awaiting multiple tasks never finishes

I want a function that receives a Func<Task<bool>> and runs it in X concurrent tasks.

To do that, I wrote the code below:

public static class RetryComponent
{
    public static async Task RunTasks(Func<Task<bool>> action, int tasks, int retries, string method)
    {
        // Running everything
        var tasksPool = Enumerable.Range(0, tasks).Select(i => DoWithRetries(action, retries, method)).ToArray();
        await Task.WhenAll(tasksPool);
    }

    private static async Task<bool> DoWithRetries(Func<Task<bool>> action, int retryCount, string method)
    {
        while (true)
        {
            if (retryCount <= 0)
                return false;

            try
            {
                bool res = await action();
                if (res)
                    return true;
            }
            catch (Exception e)
            {
                // Log it
            }

            retryCount--;
            await Task.Delay(200); // retry in 200
        }
    }
}

And the following code to execute it:

BlockingCollection<int> ints = new BlockingCollection<int>();
foreach (int i in Enumerable.Range(0, 100000))
{
    ints.Add(i);
}
ints.CompleteAdding();

int taskId = 0;
var enumerable = new AsyncEnumerable<int>(async yield =>
{
    await RetryComponent.RunTasks(async () =>
    {
        try
        {
            int myTaskId = Interlocked.Increment(ref taskId);

            // usually there are async/await operations inside the while loop, this is just an example

            while (!ints.IsCompleted)
            {
                int number = ints.Take();

                Console.WriteLine($"Task {myTaskId}: {number}");
                await yield.ReturnAsync(number);
            }
        }
        catch (InvalidOperationException)
        {
            return true;
        }
        catch (Exception e)
        {
            Console.WriteLine(e);
            throw;
        }

        return true;
    }, 10, 1, MethodBase.GetCurrentMethod().Name);
});

await enumerable.ForEachAsync(number =>
{
    Console.WriteLine(number);
});

where AsyncEnumerable is from System.Collections.Async.

The console shows only Task 10: X (where X is a number from the list).

When I remove the AsyncEnumerable, everything works as intended (all tasks print and the execution ends). For some reason that I have not been able to find for a long time, using AsyncEnumerable breaks everything: the code never stops and only the last task (10) prints. (In my main code I need AsyncEnumerable for scalability reasons.) When I added more logging, I saw that tasks 1-9 never finish.

To be clear, I want multiple tasks performing async operations and yielding their results to a single AsyncEnumerable object that acts as a pipe. (That was the idea.)

The problem is that the enumerator/generator pattern is sequential, but you're trying to implement a multi-producer, single-consumer pattern. Since you use nested anonymous functions, and Stack Overflow doesn't show line numbers, it's hard to describe exactly which part of the code I'm referring to, but I'll try anyway.

The way AsyncEnumerable works is basically to wait for the producer to produce a value, then wait for the consumer to use the value, then repeat. It does not support the producer and consumer running at different speeds, which is why I say this pattern is sequential. It does not have a queue of produced items, only the current value. ReturnAsync does not itself wait for the consumer to use the value; instead you are supposed to await the task that it returns, which signals that the enumerator is ready for the next value. Therefore we can conclude that it's not thread-safe.

However, RetryComponent.RunTasks runs 10 tasks in parallel, and that code calls yield.ReturnAsync without checking whether another task has already called it and, if so, whether that call's task has completed. Since the Yield class only stores the current value, your 10 concurrent tasks overwrite the current value without waiting for the Yield object to be ready for a new one, so 9 of the returned tasks are lost and never completed. Since those 9 tasks never complete, the methods awaiting them never finish, Task.WhenAll never returns, and neither does any other method in the entire call stack.
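The overwrite described above goes away if only one producer at a time is allowed inside yield.ReturnAsync. Here is a minimal sketch of that idea, assuming the emit operation can be expressed as a Func<T, Task>. SerializedEmit and Wrap are hypothetical names for illustration and are not part of the AsyncEnumerable library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SerializedEmit
{
    // Wraps an async "emit" delegate so that at most one caller at a time
    // is inside it. Other callers wait asynchronously at the gate.
    // (Hypothetical helper; not part of any library.)
    public static Func<T, Task> Wrap<T>(Func<T, Task> emit)
    {
        var gate = new SemaphoreSlim(1, 1);
        return async item =>
        {
            await gate.WaitAsync();
            try
            {
                // The gate stays closed until the emit task completes,
                // so a second producer cannot overwrite the current value.
                await emit(item);
            }
            finally
            {
                gate.Release();
            }
        };
    }
}
```

Inside the AsyncEnumerable lambda from the question, this could be used roughly as follows (assuming yield.ReturnAsync returns a Task, as the question's code suggests):

    var emit = SerializedEmit.Wrap<int>(n => yield.ReturnAsync(n));
    // ... in each producer:
    await emit(number);

The producers still run their own async work concurrently; only the hand-off to the enumerator is serialized.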

I created an issue on GitHub proposing that they improve the library to throw an exception when this happens. If they implement it, your catch block would write the message to the console and rethrow the error, putting the task in a faulted state, which would allow Task.WhenAll to complete, so your program would not hang.

You could use multi-threaded synchronization APIs to ensure that only one task at a time calls yield.ReturnAsync and awaits the returned task. Or you could avoid the multi-producer pattern altogether, since a single producer can easily be an enumerator. Otherwise you'll need to completely rethink how you want to implement the multi-producer pattern. I suggest TPL Dataflow, which is built in to .NET Core and available for the .NET Framework as a NuGet package.
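As a rough illustration of the TPL Dataflow suggestion, the sketch below has several producers send values into a bounded BufferBlock<int> while a single consumer drains it. The class name DataflowPipe, the capacity of 20, and the Task.Delay stand-in for real async work are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowPipe
{
    // Runs `producerCount` producers concurrently, each sending
    // `itemsPerProducer` values, and returns everything the single
    // consumer received. (Hypothetical example, not the asker's code.)
    public static async Task<List<int>> RunAsync(int producerCount, int itemsPerProducer)
    {
        // Bounded buffer: SendAsync waits when 20 items are already queued.
        var buffer = new BufferBlock<int>(
            new DataflowBlockOptions { BoundedCapacity = 20 });

        var producers = Enumerable.Range(0, producerCount)
            .Select(p => Task.Run(async () =>
            {
                for (var i = 0; i < itemsPerProducer; i++)
                {
                    await Task.Delay(1); // stand-in for real async work
                    await buffer.SendAsync(p * itemsPerProducer + i);
                }
            }))
            .ToArray();

        // Mark the buffer complete once every producer has finished.
        var allProducers = Task.WhenAll(producers);
        _ = allProducers.ContinueWith(t => buffer.Complete());

        // Single consumer: loop until the buffer is complete and empty.
        var received = new List<int>();
        while (await buffer.OutputAvailableAsync())
        {
            while (buffer.TryReceive(out var item))
            {
                received.Add(item);
            }
        }

        await allProducers; // surface any producer exception
        return received;
    }
}
```

Unlike the single-slot Yield object, the buffer queues items, so producers and the consumer can run at different speeds without losing values.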

@zivkan is absolutely right about the sequential producer pattern. If you want concurrent producers feeding a single stream, it is still possible to implement that with the AsyncEnumerable library, but it requires some extra code.

Here is an example of a solution to the problem with concurrent producers and consumers (only one consumer in this case):

        static void Main(string[] args)
        {
            var e = new AsyncEnumerable<int>(async yield =>
            {
                var threadCount = 10;
                var maxItemsOnQueue = 20;

                var queue = new ConcurrentQueue<int>();
                var consumerLimiter = new SemaphoreSlim(initialCount: 0, maxCount: maxItemsOnQueue + 1);
                var produceLimiter = new SemaphoreSlim(initialCount: maxItemsOnQueue, maxCount: maxItemsOnQueue);

                // Kick off producers
                var producerTasks = Enumerable.Range(0, threadCount)
                    .Select(index => Task.Run(() => ProduceAsync(queue, produceLimiter, consumerLimiter)));

                // When production ends, send a termination signal to the consumer.
                var endOfProductionTask = Task.WhenAll(producerTasks).ContinueWith(_ => consumerLimiter.Release());

                // The consumer loop.
                while (true)
                {
                    // Wait for an item to be produced, or a signal for the end of production.
                    await consumerLimiter.WaitAsync();

                    // Get a produced item.
                    if (queue.TryDequeue(out var item))
                    {
                        // Tell producers that they can keep producing.
                        produceLimiter.Release();
                        // Yield a produced item.
                        await yield.ReturnAsync(item);
                    }
                    else
                    {
                        // If the queue is empty, the production is over.
                        break;
                    }
                }
            });

            e.ForEachAsync((item, index) => Console.WriteLine($"{index + 1}: {item}")).Wait();
        }

        static async Task ProduceAsync(ConcurrentQueue<int> queue, SemaphoreSlim produceLimiter, SemaphoreSlim consumerLimiter)
        {
            var rnd = new Random();
            for (var i = 0; i < 10; i++)
            {
                await Task.Delay(10);
                var value = rnd.Next();

                await produceLimiter.WaitAsync(); // Wait for the next production slot
                queue.Enqueue(value); // Produce item on the queue
                consumerLimiter.Release(); // Notify the consumer
            }
        }
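For readers on C# 8 and .NET Core 3.0 or later, the bounded queue plus two semaphores above corresponds closely to a bounded Channel<T> combined with the built-in IAsyncEnumerable<T>. This is a different technique from the AsyncEnumerable library used in the answer; the sketch below is only an illustration, and ChannelPipe, the capacity of 20, and the Task.Delay stand-in are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelPipe
{
    // Exposes the output of several concurrent producers as one async stream.
    // (Hypothetical example; names and sizes are illustrative.)
    public static async IAsyncEnumerable<int> ProduceConcurrently(
        int producerCount, int itemsPerProducer)
    {
        // Bounded channel: WriteAsync waits while 20 items are queued.
        var channel = Channel.CreateBounded<int>(20);

        var producers = Enumerable.Range(0, producerCount)
            .Select(p => Task.Run(async () =>
            {
                for (var i = 0; i < itemsPerProducer; i++)
                {
                    await Task.Delay(1); // stand-in for real async work
                    await channel.Writer.WriteAsync(p * itemsPerProducer + i);
                }
            }))
            .ToArray();

        // Completing the writer ends the reader's stream once it is drained.
        var allProducers = Task.WhenAll(producers);
        _ = allProducers.ContinueWith(t => channel.Writer.Complete());

        await foreach (var item in channel.Reader.ReadAllAsync())
        {
            yield return item;
        }

        await allProducers; // surface any producer exception
    }
}
```

A consumer would iterate it with await foreach (var item in ChannelPipe.ProduceConcurrently(10, 10)). Note that if the consumer stops enumerating early, producers blocked on the full channel are not cancelled in this sketch.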
