繁体   English   中英

如何在 C# 中使用多线程进行批处理

[英]How to do batch processing using multi-threading in C#

我正在使用并行 foreach 在阻塞集合中添加值,但是当阻塞集合有 10 个值时,我需要对其进行一些处理,然后再次清除该阻塞集合,然后再次开始向阻塞集合添加值。

这里有两个问题

  1. 虽然我正在做一些处理,它将继续向阻塞集合添加值,我可以在列表上加一个锁,但当它到达锁时,值会增加。

  2. 如果我放置的锁完全破坏了并行编程的使用,我希望在该列表中添加 object 直到这 10 条消息被处理。 我可以在这里复制列表内容并再次清空列表,同样的问题我不能只复制 10 个项目,因为内容已经更改。

有时 if 条件永远不会满足,因为在检查条件之前,值会增加。

有什么解决办法吗?

public static BlockingCollection<string> failedMessages = new BlockingCollection<string>();
static void Main(string[] args)
{
    var myCollection = new List<string>();
    myCollection.Add("test");
    //Consider myCollection having more than 100 items 
    Parallel.ForEach(myCollection, item =>
    {
        failedMessages.Add(item);
        if (failedMessages.Count == 10)
        {
            DoSomething();
        }
    });

}

static public void DoSomething()
{
    //dosome operation with failedMessages 
    failedMessages = new BlockingCollection<string>();
}   

    

这看起来像是 DataFlow 的工作:

使用批大小为 10 的BatchBlock<string>ActionBlock<string[]>来消耗批次的示例:

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
                    
public class Program
{
    public static void Main()
    {
        Console.WriteLine("Hello World");
        // Set up DataFlow Blocks
        BatchBlock<string> batcher = new BatchBlock<string>( 10 );
        ActionBlock<string[]> consumer = 
            new ActionBlock<string[]>( 
                (msgs) => Console.WriteLine("Processed {0} messages.", msgs.Length)
            );
        // put them together
        batcher.LinkTo( consumer );
        
        // start posting
        Parallel.For( 0, 103, (i) => batcher.Post(string.Format("Test {0}",i)));
        
        // shutdown
        batcher.Complete();
        batcher.Completion.Wait();
    }
}

在行动: https://dotnetfiddle.net/Y9Ezg4

进一步阅读: https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/walkthrough-using-batchblock-and-batchedjoinblock-to-improve-efficiency


编辑:根据要求 - 如果您不能或不想使用 DataFlow,您当然可以做类似的事情:

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.Linq;
                    
public class Program
{
    public static void Main()
    {
        FailedMessageHandler fmh = new FailedMessageHandler( new Progress<string[]>((list) => { Console.WriteLine("Handling {0} messages. [{1}]", list.Length, string.Join(",", list));}));
        Parallel.For(0,52, (i) => {fmh.Add(string.Format("Test {0,3}",i));});
        Thread.Sleep(1500); // Demo: Timeout
        var result = Parallel.For(53,107, (i) => {fmh.Add(string.Format("Test {0,3}",i));});
        while(!result.IsCompleted)
        {
            // Let Parallel.For run to end ...
            Thread.Sleep(10);
        }
        // Graceful shutdown:
        fmh.CompleteAdding();
        fmh.AwaitCompletion();
    }
}

public class FailedMessageHandler
{
    private BlockingCollection<string> workQueue = new BlockingCollection<string>();
    private List<string> currentBuffer = new List<string>(10);
    private IProgress<string[]> progress;
    private Thread workThread;
    
    public FailedMessageHandler( IProgress<string[]> progress )
    {
        this.progress = progress;
        workThread = new Thread(WatchDog);
        workThread.Start();
    }
    
    public void Add( string failedMessage )
    {
        if ( workQueue.IsAddingCompleted )
        {
            throw new InvalidOperationException("Adding is completed!");
        }
        
        workQueue.Add(failedMessage);
    }
    
    private void WatchDog()
    {
        while(true)
        {
            // Demo: Include a timeout - If there are less than 10 items
            // for x amount of time, send whatever you got so far.
            CancellationTokenSource timeout = new CancellationTokenSource(TimeSpan.FromSeconds(1));
            try{
               var failedMsg = workQueue.Take(timeout.Token);
               currentBuffer.Add(failedMsg);
               if( currentBuffer.Count >= 10 ){
                   progress.Report(currentBuffer.ToArray());
                   currentBuffer.Clear();
               }
            }
            catch(OperationCanceledException)
            {
                Console.WriteLine("TIMEOUT!");
                // timeout.
                if( currentBuffer.Any() ) // handle items if there are
                {
                    progress.Report(currentBuffer.ToArray());
                    currentBuffer.Clear();
                }
            }
            catch(InvalidOperationException)
            {
                Console.WriteLine("COMPLETED!");
                // queue has been completed.
                if( currentBuffer.Any() ) // handle remaining items
                {
                    progress.Report(currentBuffer.ToArray());
                    currentBuffer.Clear();
                }
                break;
            }
        }
        Console.WriteLine("DONE!");
    }
    
    public void CompleteAdding()
    {
        workQueue.CompleteAdding();
    }
    
    public void AwaitCompletion()
    {
        if( workThread != null )
            workThread.Join();
    }
}

在行动: https://dotnetfiddle.net/H2Rg35

请注意,使用Progress将在主线程上执行处理。 如果您改为传递一个Action ,它将在workThread上执行。 因此,请根据您的要求调整示例。

这也只是给出一个想法,这有很多变体,可能使用Task/Async ...

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM