Async Producer/Consumer

I have an instance of a class that is accessed from several threads. The class takes these calls and adds a tuple to a database. This must happen serially because, due to some database constraints, parallel threads could leave the database in an inconsistent state.

As I am new to parallelism and concurrency in C#, I did this:

private BlockingCollection<Task> _tasks = new BlockingCollection<Task>();

public void AddDData(string info)
{
    Task t = new Task(() => { InsertDataIntoBase(info); });
    _tasks.Add(t);
}

private void InsertWorker()
{
    Task.Factory.StartNew(() =>
    {
        while (!_tasks.IsCompleted)
        {
            Task t;
            if (_tasks.TryTake(out t))
            {
                t.Start();
                t.Wait();
            }
        }
    });
}

AddDData is the method called by multiple threads, and InsertDataIntoBase is a very simple insert that should take a few milliseconds.

The problem is that, for a reason I cannot figure out, sometimes a task is called twice! It always goes like this:

T1 T2 T3 T1 <- PK error. T4 ...

Did I understand .Take() completely wrong, am I missing something, or is my producer/consumer implementation really bad?

Best Regards, Rafael

UPDATE:

As suggested, I made a quick sandbox test implementation with this architecture and, as I suspected, it does not guarantee that a task will not be fired before the previous one finishes.

So the question remains: how do I properly queue tasks and fire them sequentially?

UPDATE 2:

I simplified the code:

private BlockingCollection<Data> _tasks = new BlockingCollection<Data>();

public void AddDData(Data info)
{
    _tasks.Add(info);
}

private void InsertWorker()
{
    Task.Factory.StartNew(() =>
    {
        while (!_tasks.IsCompleted)
        {
            Data info;
            if (_tasks.TryTake(out info))
            {
                InsertIntoDB(info);
            }
        }
    });
}

Note that I got rid of the Tasks, since I'm relying on the synchronous InsertIntoDB call (it is inside a loop), but still no luck... The generation is fine and I'm absolutely sure that only unique instances are going into the queue. But no matter what I try, sometimes the same object is used twice.

I think this should work:

    private static BlockingCollection<string> _itemsToProcess = new BlockingCollection<string>();

    static void Main(string[] args)
    {
        Task worker = InsertWorker();
        GenerateItems(10, 1000);
        _itemsToProcess.CompleteAdding();
        // Wait for the consumer to drain the queue before the process exits;
        // the worker runs on a background thread and would otherwise be cut off.
        worker.Wait();
    }

    private static Task InsertWorker()
    {
        return Task.Factory.StartNew(() =>
        {
            while (!_itemsToProcess.IsCompleted)
            {
                string t;
                if (_itemsToProcess.TryTake(out t))
                {
                    // Do whatever needs doing here
                    // Order should be guaranteed since BlockingCollection 
                    // uses a ConcurrentQueue as a backing store by default.
                    // http://msdn.microsoft.com/en-us/library/dd287184.aspx#remarksToggle
                    Console.WriteLine(t);
                }
            }
        });
    }

    private static void GenerateItems(int count, int maxDelayInMs)
    {
        Random r = new Random();
        string[] items = new string[count];

        for (int i = 0; i < count; i++)
        {
            items[i] = i.ToString();
        }

        // Simulate many threads adding items to the collection
        items
            .AsParallel()
            .WithDegreeOfParallelism(4)
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Select((x) =>
            {
                int delay;
                // Random is not thread-safe, so serialize access to it.
                lock (r) { delay = r.Next(maxDelayInMs); }
                Thread.Sleep(delay);
                _itemsToProcess.Add(x);
                return x;
            }).ToList();
    }

This does mean that the consumer is single-threaded, but it allows for multiple producer threads.
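As a variation on the same idea, the single consumer can block on the collection instead of polling TryTake in a loop. The following is only a minimal sketch under assumptions: the class name SerialConsumerSketch, the Run helper, and the use of int items in place of the real data are all made up for illustration, and the list append stands in for the actual database insert.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical illustration class; not part of the original question or answer.
public static class SerialConsumerSketch
{
    // Runs several producers against one consumer and returns the items
    // in the order the single consumer processed them.
    public static List<int> Run(int producers, int itemsPerProducer)
    {
        var queue = new BlockingCollection<int>();
        var processed = new List<int>(); // touched only by the consumer task

        // LongRunning hints that this loop should get a dedicated thread.
        Task consumer = Task.Factory.StartNew(() =>
        {
            // Blocks while the queue is empty; the loop ends once
            // CompleteAdding() has been called and the queue has drained.
            foreach (int item in queue.GetConsumingEnumerable())
            {
                processed.Add(item); // stand-in for the real database insert
            }
        }, TaskCreationOptions.LongRunning);

        // Each producer adds its own disjoint range of values.
        Task[] producerTasks = Enumerable.Range(0, producers)
            .Select(p => Task.Run(() =>
            {
                for (int i = 0; i < itemsPerProducer; i++)
                {
                    queue.Add(p * itemsPerProducer + i);
                }
            }))
            .ToArray();

        Task.WaitAll(producerTasks);
        queue.CompleteAdding(); // signal that no more items will arrive
        consumer.Wait();        // let the consumer drain what is left
        return processed;
    }

    public static void Main()
    {
        List<int> result = Run(4, 250);
        Console.WriteLine(result.Count);            // items processed
        Console.WriteLine(result.Distinct().Count()); // unique items processed
    }
}
```

With a single consumer, both numbers printed should match the number of items produced; if the distinct count were lower, some item would have been handled more than once.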

From your comment

"I simplified the code shown here, as the data is not a string"

I assume that the info parameter passed into AddDData is a mutable reference type. Make sure that the caller is not using the same info instance for multiple calls, since that reference is captured in the Task lambda.
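To make that pitfall concrete, here is a small sketch of what happens when one mutable object is reused across several queued lambdas. The Data class with an Id property is an assumption for illustration (the real type is not shown in the question), as are the SharedInstanceSketch helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real data type, which the question does not show.
public class Data
{
    public int Id { get; set; }
}

public static class SharedInstanceSketch
{
    // Reuses ONE Data object for every queued lambda.
    public static List<int> ReuseInstance(int count)
    {
        var queued = new List<Func<int>>();
        var shared = new Data();
        for (int i = 1; i <= count; i++)
        {
            shared.Id = i;
            queued.Add(() => shared.Id); // captures the reference, not a copy
        }
        // By the time the lambdas run, every one of them sees the last Id.
        return queued.Select(f => f()).ToList();
    }

    // Creates a new Data object per queued lambda.
    public static List<int> FreshInstance(int count)
    {
        var queued = new List<Func<int>>();
        for (int i = 1; i <= count; i++)
        {
            var item = new Data { Id = i }; // one instance per call
            queued.Add(() => item.Id);
        }
        return queued.Select(f => f()).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ReuseInstance(3))); // 3,3,3
        Console.WriteLine(string.Join(",", FreshInstance(3))); // 1,2,3
    }
}
```

In the reused case, several queued inserts would try to write the same key, which would look like one task running twice.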

Based on the trace that you provided, the only logical possibility is that you have called InsertWorker twice (or more). There are thus two background threads waiting for items to appear in the collection, and occasionally they both manage to grab an item and begin executing it.
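If that is the cause, one possible guard is to make starting the worker idempotent, so that a second call does nothing. The sketch below is an assumption-laden illustration, not the asker's code: SingleWorkerQueue, TryStartWorker, and CompleteAndWait are hypothetical names, and CompleteAndWait is assumed to be called from the same thread that started the worker.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper that allows at most one consumer to be started.
public sealed class SingleWorkerQueue
{
    private readonly BlockingCollection<string> _items = new BlockingCollection<string>();
    private readonly Action<string> _handler;
    private Task _worker = Task.CompletedTask;
    private int _started; // 0 = not started, 1 = started

    public SingleWorkerQueue(Action<string> handler)
    {
        _handler = handler;
    }

    public void Add(string item)
    {
        _items.Add(item);
    }

    // Returns true only for the one call that actually starts the consumer.
    public bool TryStartWorker()
    {
        // Atomically flip 0 -> 1; any later caller sees 1 and backs off.
        if (Interlocked.CompareExchange(ref _started, 1, 0) != 0)
        {
            return false;
        }

        _worker = Task.Factory.StartNew(() =>
        {
            foreach (string item in _items.GetConsumingEnumerable())
            {
                _handler(item); // runs on the single consumer thread only
            }
        }, TaskCreationOptions.LongRunning);
        return true;
    }

    // Assumed to be called from the thread that started the worker.
    public void CompleteAndWait()
    {
        _items.CompleteAdding();
        _worker.Wait();
    }
}

public static class SingleWorkerDemo
{
    public static void Main()
    {
        int handled = 0;
        var queue = new SingleWorkerQueue(_ => handled++);

        Console.WriteLine(queue.TryStartWorker()); // True: worker started
        Console.WriteLine(queue.TryStartWorker()); // False: already running

        queue.Add("a");
        queue.Add("b");
        queue.CompleteAndWait();
        Console.WriteLine(handled); // 2
    }
}
```

Logging the thread ID inside the handler is another cheap way to check whether more than one consumer is actually running.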
