
Losing items somewhere in C# BlockingCollection with GetConsumingEnumerable()

I'm trying to do a parallel SqlBulkCopy to multiple targets over a WAN, many of which may have slow connections and/or connection cutoffs; their download speeds vary from 2 to 50 Mbit/s, and I am sending from a connection with 1000 Mbit/s upload; many of the targets need multiple retries to finish correctly.

I'm currently using a Parallel.ForEach on the GetConsumingEnumerable() of a BlockingCollection ( queue ); however, I either stumbled upon some bug, or I am having problems fully understanding its purpose, or simply got something wrong. The code never calls the CompleteAdding() method of the BlockingCollection; it seems that somewhere in the parallel foreach loop some of the targets get lost. Even if there are different approaches to this, and disregarding the kind of work done in the loop, the BlockingCollection shouldn't behave the way it does in this example, should it?

In the foreach loop, I do the work and add the target to a results collection if it completed successfully, or re-add the target to the BlockingCollection in case of an error until the target reaches the max-retries threshold; at that point I add it to the results collection.

In an additional Task, I loop until the count of the results collection equals the initial count of the targets; then I call CompleteAdding() on the blocking collection.

I already tried using a locking object for the operations on the results collection (using a List<int> instead) and the queue, with no luck, but that shouldn't be necessary anyway. I also tried adding the retries to a separate collection and re-adding those to the BlockingCollection in a different Task instead of in the Parallel.ForEach. Just for fun I also tried compiling with .NET from 4.5 to 4.8, and different C# language versions.

Here is a simplified example:

List<int> targets = new List<int>();
for (int i = 0; i < 200; i++)
{
    targets.Add(0);
}

BlockingCollection<int> queue = new BlockingCollection<int>(new ConcurrentQueue<int>());
ConcurrentBag<int> results = new ConcurrentBag<int>();
targets.ForEach(f => queue.Add(f));

// Bulk copy to the branches:
Task.Run(() =>
    {
        while (results.Count < targets.Count)
        {
            Thread.Sleep(2000);
            Console.WriteLine($"Completed: {results.Count} / {targets.Count} | queue: {queue.Count}");
        }
        queue.CompleteAdding();
    });

int MAX_RETRIES = 10;
ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 50 };

Parallel.ForEach(queue.GetConsumingEnumerable(), options, target =>
    {
        try
        {
            // simulate a problem with the bulkcopy:
            throw new Exception();
            results.Add(target);
        }
        catch (Exception)
        {
            if (target < MAX_RETRIES)
            {
                target++;
                if (!queue.TryAdd(target))
                    Console.WriteLine($"{target.ToString("D3")}: Error, can't add to queue!");
            }
            else
            {
                results.Add(target);
                Console.WriteLine($"Aborted after {target + 1} tries | {results.Count} / {targets.Count} items finished.");
            }

        }
    });

I expected the count of the results collection to exactly match the count of the targets list in the end, but it never seems to reach that number, so the BlockingCollection is never marked as completed and the code never finishes.

I really don't understand why not all of the targets eventually get added to the results collection. The added count always varies, and is mostly just shy of the expected final count.

EDIT: I removed the retry part and replaced the ConcurrentBag with a simple int counter, and it still doesn't work most of the time:

List<int> targets = new List<int>();
for (int i = 0; i < 500; i++)
    targets.Add(0);

BlockingCollection<int> queue = new BlockingCollection<int>(new ConcurrentQueue<int>());
//ConcurrentBag<int> results = new ConcurrentBag<int>();
int completed = 0;
targets.ForEach(f => queue.Add(f));

var thread = new Thread(() =>
{
    while (completed < targets.Count)
    {
        Thread.Sleep(2000);
        Console.WriteLine($"Completed: {completed} / {targets.Count} | queue: {queue.Count}");
    }
    queue.CompleteAdding();
});
thread.Start();

ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(queue.GetConsumingEnumerable(), options, target =>
{
    Interlocked.Increment(ref completed);
});

Parallel.ForEach is meant for data parallelism (i.e. processing 100K rows using all 8 cores), not concurrent operations. This is essentially a pub/sub and async problem, if not a pipeline problem. There's nothing for the CPU to do in this case; just start the async operations and wait for them to complete.

.NET has handled this since .NET 4.5 through the Dataflow classes and, more recently, the lower-level System.Threading.Channels namespace.
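For illustration, here is a minimal, self-contained sketch of the same fan-out shape with System.Threading.Channels (not the original poster's code; the item type, counts, and consumer count are made up for the example, and it needs .NET Core 3.0+ or the System.Threading.Channels package):

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelDemo
{
    // Fan out 200 items to 4 concurrent consumers over a bounded channel;
    // returns how many items were processed (all 200).
    public static async Task<int> RunAsync()
    {
        // Bounded capacity: writers wait when the buffer is full,
        // the same role BoundedCapacity plays for an ActionBlock.
        var channel = Channel.CreateBounded<int>(8);

        int processed = 0;
        var consumers = new Task[4];
        for (int i = 0; i < consumers.Length; i++)
        {
            consumers[i] = Task.Run(async () =>
            {
                // ReadAllAsync completes when the writer calls Complete()
                // and the buffer drains - no manual "done" flag needed.
                await foreach (var item in channel.Reader.ReadAllAsync())
                    Interlocked.Increment(ref processed);
            });
        }

        for (int i = 0; i < 200; i++)
            await channel.Writer.WriteAsync(i);

        channel.Writer.Complete();
        await Task.WhenAll(consumers);
        return processed;
    }

    public static async Task Main()
        => Console.WriteLine(await RunAsync()); // prints 200
}
```

Note that, unlike the BlockingCollection approach in the question, completion is signaled by the producer and observed by the consumers directly, so no watcher task counting results is needed.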

In its simplest form, you can create an ActionBlock<> that takes a buffer and a target connection and publishes the data. Let's say you use this method to send the data to a server:

async Task MyBulkCopyMethod(string connectionString, DataTable data)
{
    using (var bcp = new SqlBulkCopy(connectionString))
    {
        // Set up mappings etc.
        // ...
        await bcp.WriteToServerAsync(data);
    }
}

You can use this with an ActionBlock class with a configured degree of parallelism. Dataflow classes like ActionBlock have their own input and, where appropriate, output buffers, so there's no need to create a separate queue:

class DataMessage
{
    public string Connection{get;set;}
    public DataTable Data {get;set;} 
}

...

var options = new ExecutionDataflowBlockOptions {
    MaxDegreeOfParallelism = 50,
    BoundedCapacity = 8
};
var block = new ActionBlock<DataMessage>(
    msg => MyBulkCopyMethod(msg.Connection, msg.Data),
    options);

We can start posting messages to the block now. By setting the capacity to 8 we ensure the input buffer won't get filled with large messages if the block is too slow. MaxDegreeOfParallelism controls how many operations run concurrently. Let's say we want to send the same data to many servers:

var data = .....;
var servers = new[] { connString1, connString2, .... };
var messages = from sv in servers
               select new DataMessage { Connection = sv, Data = data };

foreach(var msg in messages)
{
    await block.SendAsync(msg);
}
//Tell the block we are done
block.Complete();
//Wait for all messages to finish processing
await block.Completion;

Retries

One possibility for retries is to use a retry loop in the worker function. A better idea would be to use a different block and post failed messages there.

var block = new ActionBlock<DataMessage>(async msg =>
{
    try
    {
        await MyBulkCopyMethod(msg.Connection, msg.Data);
    }
    catch (SqlException exc) when (/* some retry condition */)
    {
        //Post without awaiting
        retryBlock.Post(msg);
    }
}, options);

When the original block completes, we want to tell the retry block to complete as well, no matter what:

block.Completion.ContinueWith(_=>retryBlock.Complete());

Now we can await the retryBlock's completion.

That block could have a smaller DOP, and perhaps a delay between attempts:

var retryOptions = new ExecutionDataflowBlockOptions {
    MaxDegreeOfParallelism = 5
};
var retryBlock = new ActionBlock<DataMessage>(async msg =>
{
    await Task.Delay(1000);
    try
    {
        await MyBulkCopyMethod(msg.Connection, msg.Data);
    }
    catch (Exception)
    {
        //Give up, log, or escalate to another level of retries.
    }
}, retryOptions);

This pattern can be repeated to create multiple levels of retry, or different retry conditions. It can also be used to create different-priority workers, by giving a larger DOP to high-priority workers or a larger delay to low-priority workers.
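Putting the pieces together, the two-block retry chain can be sketched as a self-contained program (a simulated failure stands in for SqlBulkCopy, and the names and counts are invented for the demo; it needs the System.Threading.Tasks.Dataflow package):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class RetryDemo
{
    // Odd items "fail" on the first attempt and are handed to retryBlock,
    // where the second (and last) attempt always succeeds.
    public static async Task<(int succeeded, int retried)> RunAsync()
    {
        int succeeded = 0, retried = 0;

        // Declare the retry block first so the main block's lambda can post to it.
        var retryBlock = new ActionBlock<int>(n =>
        {
            Interlocked.Increment(ref retried);
            Interlocked.Increment(ref succeeded);   // second attempt "succeeds"
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

        var block = new ActionBlock<int>(n =>
        {
            if (n % 2 == 1)
                retryBlock.Post(n);                 // simulated failure: hand off
            else
                Interlocked.Increment(ref succeeded);
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        // Chain completion: when the main block drains, close the retry block.
        _ = block.Completion.ContinueWith(_ => retryBlock.Complete());

        for (int i = 0; i < 100; i++)
            block.Post(i);
        block.Complete();

        await retryBlock.Completion;                // everything is done here
        return (succeeded, retried);
    }

    public static async Task Main()
    {
        var (ok, retries) = await RunAsync();
        Console.WriteLine($"{ok} succeeded, {retries} via retry"); // 100 succeeded, 50 via retry
    }
}
```

All posts to retryBlock happen inside the main block's worker, so they are guaranteed to land before the chained Complete() fires.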

Sorry, found the answer: the default partitioner used by BlockingCollection and Parallel.ForEach chunks and buffers, which causes the foreach loop to wait forever for enough items to fill the next chunk. For me, it sat there a whole night without processing the last few items!

So, instead of:

ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(queue.GetConsumingEnumerable(), options, target =>
{
    Interlocked.Increment(ref completed);
});

you have to use:

var partitioner = Partitioner.Create(queue.GetConsumingEnumerable(), EnumerablePartitionerOptions.NoBuffering);
ParallelOptions options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(partitioner, options, target =>
{
    Interlocked.Increment(ref completed);
});
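As a sanity check, here is a self-contained, terminating version of the fixed loop (CompleteAdding is called up front instead of from a watcher task, so the demo always exits; with items still trickling in, NoBuffering is what keeps each item flowing to a worker immediately):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class NoBufferingDemo
{
    // Consumes all 500 items; returns the completed count.
    public static int Run()
    {
        var queue = new BlockingCollection<int>(new ConcurrentQueue<int>());
        for (int i = 0; i < 500; i++)
            queue.Add(0);
        // Completing up front makes the demo deterministic.
        queue.CompleteAdding();

        int completed = 0;

        // NoBuffering: hand one item at a time to each worker thread,
        // instead of waiting to accumulate a chunk.
        var partitioner = Partitioner.Create(
            queue.GetConsumingEnumerable(),
            EnumerablePartitionerOptions.NoBuffering);

        Parallel.ForEach(partitioner,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            _ => Interlocked.Increment(ref completed));

        return completed;
    }

    public static void Main() => Console.WriteLine(Run()); // prints 500
}
```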

