
TPL Dataflow, BroadcastBlock to BatchBlocks

I have a problem connecting BroadcastBlock(s) to BatchBlocks. The scenario is that the sources are BroadcastBlocks, and the recipients are BatchBlocks.

In the simplified code below, only one of the supplemental action blocks executes. I even set the batchSize for each BatchBlock to 1 to illustrate the problem.

Setting Greedy to "true" would make the 2 ActionBlocks execute, but that's not what I want as it will cause the BatchBlock to proceed even if it's not complete yet. Any ideas?

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Program
{
    static void Main(string[] args)
    {
        // My possible sources are BroadcastBlocks. Could be more
        var source1 = new BroadcastBlock<int>(z => z);

        // batch 1
        // can be many potential sources, one for now
        // I want all sources to arrive first before proceeding
        var batch1 = new BatchBlock<int>(1, new GroupingDataflowBlockOptions() { Greedy = false }); 
        var batch1Action = new ActionBlock<int[]>(arr =>
        {
            // this does not run sometimes
            Console.WriteLine("Received from batch 1 block!");
            foreach (var item in arr)
            {
                Console.WriteLine("Received {0}", item);
            }
        });

        batch1.LinkTo(batch1Action, new DataflowLinkOptions() { PropagateCompletion = true });

        // batch 2
        // can be many potential sources, one for now
        // I want all sources to arrive first before proceeding
        var batch2 = new BatchBlock<int>(1, new GroupingDataflowBlockOptions() { Greedy = false  });
        var batch2Action = new ActionBlock<int[]>(arr =>
        {
            // this does not run sometimes
            Console.WriteLine("Received from batch 2 block!");
            foreach (var item in arr)
            {
                Console.WriteLine("Received {0}", item);
            }
        });
        batch2.LinkTo(batch2Action, new DataflowLinkOptions() { PropagateCompletion = true });

        // connect source(s)
        source1.LinkTo(batch1, new DataflowLinkOptions() { PropagateCompletion = true });
        source1.LinkTo(batch2, new DataflowLinkOptions() { PropagateCompletion = true });

        // fire
        source1.SendAsync(3);

        Task.WaitAll(new Task[] { batch1Action.Completion, batch2Action.Completion });

        Console.ReadLine();
    }
}

You have misunderstood what the Greedy flag does. If it is true (the default), a batch block accepts every message as soon as it is offered, even when there is not yet enough data to form a batch, and buffers it until the batch is full. By setting Greedy = false, you tell TPL Dataflow that the batch block should postpone offered messages and consume them only once enough sources have offered data to fill a batch, so the batch block may or may not end up getting the message from the broadcast block.
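To make the non-greedy behaviour concrete, here is a minimal sketch, assuming two BufferBlock sources (sourceA and sourceB are hypothetical names, not from the question) linked to one non-greedy BatchBlock with a batch size of 2. The batch should not be produced while only one source has offered a message, and should appear once the second source has offered one too.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class NonGreedyDemo
{
    static async Task Main()
    {
        // Two independent sources (hypothetical) feeding one non-greedy batch block.
        var sourceA = new BufferBlock<int>();
        var sourceB = new BufferBlock<int>();

        var batch = new BatchBlock<int>(2, new GroupingDataflowBlockOptions { Greedy = false });
        sourceA.LinkTo(batch);
        sourceB.LinkTo(batch);

        // Start waiting for the first batch without blocking a thread.
        Task<int[]> pending = batch.ReceiveAsync();

        // Only one source has data: the batch block postpones the offer.
        sourceA.Post(1);
        Task first = await Task.WhenAny(pending, Task.Delay(500));
        Console.WriteLine("Batch ready after one source: {0}", first == pending);

        // The second source offers data: the block can now consume from both.
        sourceB.Post(2);
        int[] items = await pending;
        Console.WriteLine("Batch size after both sources: {0}", items.Length);
    }
}
```

The 500 ms delay is an arbitrary value chosen for illustration. BufferBlock is used here instead of BroadcastBlock because it holds on to a postponed message until a target consumes it.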

Moreover, calling Task.WaitAll(new Task[] { batch1Action.Completion, batch2Action.Completion }); blocks the main thread until both Completion tasks finish. Blocking waits like this may lead to deadlock if a thread is blocked before messages have been posted across the pipeline. Also, you never call source1.Complete(), so this WaitAll call will never return.
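The two points above (completing the source and waiting without blocking) can be sketched as follows. This is a minimal illustration, assuming a single BufferBlock source and an ActionBlock named printer, neither of which comes from the question.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class CompletionDemo
{
    static async Task Main()
    {
        var source = new BufferBlock<int>();
        var printer = new ActionBlock<int>(n => Console.WriteLine("Received {0}", n));

        // PropagateCompletion forwards the source's completion to the target.
        source.LinkTo(printer, new DataflowLinkOptions { PropagateCompletion = true });

        source.Post(1);
        source.Post(2);

        // Without this call, printer.Completion would never finish.
        source.Complete();

        // Asynchronous wait: no thread is blocked while the pipeline drains.
        await printer.Completion;
        Console.WriteLine("Pipeline completed");
    }
}
```

Removing the source.Complete() line reproduces the hang described above, because nothing ever tells the ActionBlock that no more data is coming.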

What you really need is to set Greedy to true (which is the default), set the batch size to the value you need (for example, 2), call the Complete() method, and avoid thread-blocking methods in your pipeline. That way, your batch blocks will get all the data from the broadcast block, but the blocks after them won't receive anything until a full batch has been gathered:

var source1 = new BroadcastBlock<int>(z => z);
var options = new DataflowLinkOptions { PropagateCompletion = true };

// batch1Action won't execute: this batch is non-greedy, so it doesn't take the data
var batch1 = new BatchBlock<int>(2, new GroupingDataflowBlockOptions { Greedy = false });
var batch1Action = new ActionBlock<int[]>(arr =>
{
    Console.WriteLine("Received from batch 1 block!");
    foreach (var item in arr)
    {
        Console.WriteLine("Received {0}", item);
    }

});
batch1.LinkTo(batch1Action, options);

// this batch is greedy, so it will always execute
var batch2 = new BatchBlock<int>(2);
var batch2Action = new ActionBlock<int[]>(arr =>
{
    Console.WriteLine("Received from batch 2 block!");
    foreach (var item in arr)
    {
        Console.WriteLine("Received {0}", item);
    }
});
batch2.LinkTo(batch2Action, options);

// connect source(s)
source1.LinkTo(batch1, options);
source1.LinkTo(batch2, options);

// fire
source1.SendAsync(3);
// simulate some other work
Thread.Sleep(3000);
// this completes the batch, so batch2Action will now execute
source1.SendAsync(3);

// if you need to wait for completion, first signal that no more data is coming
source1.Complete();
// note that WhenAll doesn't block; it only returns a task
var allTasks = Task.WhenAll(batch1Action.Completion, batch2Action.Completion);
// non-blocking wait (inside an async method)
await allTasks;
// or, alternatively, a blocking wait
allTasks.Wait();

Console.ReadLine();
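
For reference, here is a self-contained sketch of the recommended setup with both batch blocks left greedy. It is an illustration under stated assumptions rather than the original poster's code: the helper MakePrinter and the sent values 3 and 4 are made up for this example.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class GreedyBatchDemo
{
    // Hypothetical helper: builds an ActionBlock that prints one batch.
    static ActionBlock<int[]> MakePrinter(string name) =>
        new ActionBlock<int[]>(arr =>
            Console.WriteLine("{0} received: {1}", name, string.Join(", ", arr)));

    static async Task Main()
    {
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        var source = new BroadcastBlock<int>(z => z);

        // Greedy is true by default, so each batch block accepts every offer
        // and releases a batch only once it holds two items.
        var batch1 = new BatchBlock<int>(2);
        var batch2 = new BatchBlock<int>(2);
        var batch1Action = MakePrinter("batch 1");
        var batch2Action = MakePrinter("batch 2");

        batch1.LinkTo(batch1Action, options);
        batch2.LinkTo(batch2Action, options);
        source.LinkTo(batch1, options);
        source.LinkTo(batch2, options);

        await source.SendAsync(3);
        await source.SendAsync(4);

        // Signal the end of input, then wait asynchronously for both branches.
        source.Complete();
        await Task.WhenAll(batch1Action.Completion, batch2Action.Completion);
    }
}
```

Both action blocks should each receive one batch containing the two values. The order in which the two branches print is not deterministic.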

