
What is the best way to share data containers between threads in C++?

I have an application which has a couple of processing levels, like:

InputStream->Pre-Processing->Computation->OutputStream

Each of these entities runs in a separate thread. So in my code I have the main thread, which owns the

std::vector<ImageRead> m_readImages;

and then it passes this member variable to each thread:

InputStream input{&m_readImages};
std::thread threadStream{&InputStream::start, &input};
PreProcess pre{&m_readImages};
std::thread preStream{&PreProcess::start, &pre};
...

And each of these classes holds a pointer member to this data:

std::vector<ImageRead>* m_ptrReadImages;

I also have a global mutex defined, which I lock and unlock on every read/write operation on that shared container. What bothers me is that this mechanism is pretty obscure, and sometimes I get confused about whether the data is currently in use by another thread or not.

So what is a more straightforward way to share this container between those threads?

The process you described as "Input -> Preprocessing -> Computation -> Output" is sequential by design: each step depends on the previous one, so parallelizing in this particular manner is not beneficial, as each thread just has to wait for another to complete. Try to find out which step takes the most time and parallelize that. Or try to set up multiple parallel processing pipelines that operate sequentially on independent, individual data sets. A usual approach for that would employ a processing queue which distributes tasks among a set of threads.

It would seem to me that your reading and preprocessing could be done independently of the container.

Naively, I would structure this as a fan-out and then fan-in network of tasks.

First, make a dispatch task (a task is a unit of work that is given to a thread to actually execute) that creates the input-and-preprocess tasks.

Use futures as a means for the sub-tasks to communicate back a pointer to the completely loaded image.

Make a second task, the std::vector builder task, that just waits on the futures to collect the results as they complete and adds them to the std::vector.

I suggest you structure things this way because I suspect that any IO and preprocessing you are doing will take longer than setting a value in the vector. Using tasks instead of threads directly lets you tune the parallel portion of your work.
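The fan-out/fan-in shape above can be sketched with std::async and futures (assuming C++11). The `loadAndPreprocess` function and the `ImageRead` contents here are illustrative stand-ins for the real IO and preprocessing work:

```cpp
#include <future>
#include <string>
#include <vector>

struct ImageRead { std::string source; };

// Placeholder for the real load + preprocess work done by each sub-task.
ImageRead loadAndPreprocess(const std::string& file) {
    return ImageRead{file};
}

std::vector<ImageRead> loadAll(const std::vector<std::string>& files) {
    // Fan out: one asynchronous task per input file.
    std::vector<std::future<ImageRead>> pending;
    for (const auto& f : files)
        pending.push_back(std::async(std::launch::async, loadAndPreprocess, f));

    // Fan in: the builder collects the results into the vector.
    std::vector<ImageRead> images;
    for (auto& fut : pending)
        images.push_back(fut.get());
    return images;
}
```

Because the builder iterates the futures in order, the resulting vector preserves the input order even though the sub-tasks finish in arbitrary order.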

I hope that's not too abstracted away from the concrete elements. This is a pattern I find to be well balanced between saturating the available hardware and reducing thrash/lock contention, and it remains understandable to future-you when debugging it later.

I would use 3 separate queues: ready_for_preprocessing, which is fed by InputStream and consumed by Pre-Processing; ready_for_computation, which is fed by Pre-Processing and consumed by Computation; and ready_for_output, which is fed by Computation and consumed by OutputStream.

You'll want each queue to be in a class which has an access mutex (to control actually adding and removing items from the queue) and an "image available" semaphore (to signal that items are available), as well as the actual queue. This would allow multiple instances of each thread. Something like this:

class imageQueue
{
    std::deque<ImageRead> m_readImages;
    std::mutex            m_changeQueue;
    Semaphore             m_imagesAvailable; // placeholder type; C++20 offers std::counting_semaphore

    public:
    bool addImage( ImageRead );
    ImageRead getNextImage();
};

addImage() takes the m_changeQueue mutex, adds the image to m_readImages, then signals m_imagesAvailable.

getNextImage() waits on m_imagesAvailable. When it becomes signaled, it takes m_changeQueue, removes the next image from the list, and returns it.
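One possible implementation of those two methods, assuming C++11. Since the answer's `Semaphore` type is not standard before C++20, this sketch uses a std::condition_variable to play the "image available" role instead; the `ImageRead` contents are illustrative:

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>

struct ImageRead { std::string source; };

class imageQueue
{
    std::deque<ImageRead>   m_readImages;
    std::mutex              m_changeQueue;
    std::condition_variable m_imagesAvailable; // stands in for the Semaphore

    public:
    bool addImage(ImageRead image)
    {
        {
            std::lock_guard<std::mutex> lock(m_changeQueue);
            m_readImages.push_back(std::move(image));
        }
        m_imagesAvailable.notify_one(); // signal "image available"
        return true;
    }

    ImageRead getNextImage()
    {
        std::unique_lock<std::mutex> lock(m_changeQueue);
        // Wait until at least one image is in the queue.
        m_imagesAvailable.wait(lock, [this] { return !m_readImages.empty(); });
        ImageRead image = std::move(m_readImages.front());
        m_readImages.pop_front();
        return image;
    }
};
```

With a condition variable the "count" of available items is the queue size itself, so no separate semaphore counter needs to be kept in sync.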

cf. http://en.cppreference.com/w/cpp/thread

Ignoring the question of whether each operation should run in an individual thread, it appears that the objects you want to process move from thread to thread. In effect, they are uniquely owned by only one thread at a time (no thread ever needs to access data that another thread is working on). There is a way to express exactly that in C++: std::unique_ptr.

Each step then only works on the image it owns. All you have to do is find a thread-safe way to move the ownership of your images through the process steps one by one, which means the critical sections are only at the boundaries between tasks. Since you have several such boundaries, abstracting them away would be reasonable:

class ProcessBoundary
{
public:
  ProcessBoundary() : running(true) {}

  void setImage(std::unique_ptr<ImageRead> newImage)
  {
    while (running)
    {
      {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (m_imageToTransfer == nullptr)
        {
          // The previous image has been taken by the next step, so we can place this one here.
          m_imageToTransfer = std::move(newImage);
          return;
        }
      }
      std::this_thread::yield();
    }
  }

  std::unique_ptr<ImageRead> getImage()
  {
    while (running)
    {
      {
        std::lock_guard<std::mutex> guard(m_mutex);
        if (m_imageToTransfer != nullptr)
        {
          // An image is waiting here; take ownership and hand it to this step.
          return std::move(m_imageToTransfer);
        }
      }
      std::this_thread::yield();
    }
    return nullptr; // stopped before an image arrived
  }

  void stop()
  {
    running = false;
  }

private:
  std::mutex m_mutex;
  std::unique_ptr<ImageRead> m_imageToTransfer;
  std::atomic<bool> running;
};

The process steps would then ask for an image with getImage(), which they uniquely own once that function returns. They process it and pass it to the setImage() of the next ProcessBoundary.

You could probably improve on this with condition variables, or by adding a queue to this class so that threads can get back to processing the next image sooner. However, if some steps are faster than others, they will eventually be stalled by the slower ones anyway.
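One way to realize the condition-variable improvement, assuming C++11: a thread sleeps until the hand-over slot is free (in setImage) or occupied (in getImage), instead of spinning on yield. This is a sketch that mirrors the class above but omits the stop()/running machinery for brevity:

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>

struct ImageRead { int id; }; // illustrative payload

class ProcessBoundary
{
public:
  void setImage(std::unique_ptr<ImageRead> newImage)
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    // Sleep until the previous image has been taken by the next step.
    m_slotFree.wait(lock, [this] { return m_imageToTransfer == nullptr; });
    m_imageToTransfer = std::move(newImage);
    m_slotFull.notify_one();
  }

  std::unique_ptr<ImageRead> getImage()
  {
    std::unique_lock<std::mutex> lock(m_mutex);
    // Sleep until an image has been placed in the slot.
    m_slotFull.wait(lock, [this] { return m_imageToTransfer != nullptr; });
    auto image = std::move(m_imageToTransfer);
    m_slotFree.notify_one();
    return image;
  }

private:
  std::mutex m_mutex;
  std::condition_variable m_slotFree;
  std::condition_variable m_slotFull;
  std::unique_ptr<ImageRead> m_imageToTransfer;
};
```

Two condition variables keep producer and consumer wake-ups separate; with a single one you would have to notify_all and let the wrong side go back to sleep.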

This is a design pattern problem. I suggest reading about concurrency design patterns and seeing if there is anything that would help you out.

If you want to add concurrency to the following sequential process:

InputStream->Pre-Processing->Computation->OutputStream

Then I suggest using the active object design pattern. This way each step is not blocked by the previous one and they can run concurrently. It is also very simple to implement (here is an implementation: http://www.drdobbs.com/parallel/prefer-using-active-objects-instead-of-n/225700095 )

As to your question about each thread sharing a DTO: this is easily solved with a wrapper around the DTO. The wrapper will contain write and read functions. The write function blocks on a mutex, and the read returns const data.
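A sketch of that wrapper, assuming C++11; the `ImageDTO` fields are illustrative. Reads return the data by value, so callers cannot mutate the shared state:

```cpp
#include <mutex>
#include <string>

struct ImageDTO { std::string name; int width = 0; }; // illustrative DTO

class SharedImageDTO {
public:
    void write(const ImageDTO& value) {
        std::lock_guard<std::mutex> lock(m_mutex); // writers block each other
        m_data = value;
    }

    const ImageDTO read() const {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_data; // a const copy; shared state stays untouched
    }

private:
    mutable std::mutex m_mutex; // mutable so the const read() can lock it
    ImageDTO m_data;
};
```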

However, I think your problem lies in the design. If the process is sequential as you described, then why is each step sharing the data? The data should be passed on to the next step once the current one completes. In other words, each step should be decoupled.

You are correct in using mutexes and locks. For C++11, this is really the most elegant way of accessing complex data between threads.
