C++: Multi-thread design when each thread is supposed to do both I/O and CPU-intensive tasks

I have a situation where I am offloading my work to threads. The "work" consists of two parts:

  • First compress the given data buffer
  • Then write the compressed data to disk

My main thread is continuously creating many data buffers.

I was initially thinking of a thread pool design, but then it is possible that all the threads in the pool end up waiting on I/O at the same time.

If I create a new thread whenever I create a new dataBuffer, I see that a large number of threads get created. This adds context-switching overhead, but because of the context switches my CPU cycles are not wasted.

What would be a good design to manage this situation?

Let me see if I can help with this.

1. First compress the given data buffer
2. Then write the compressed data to disk

What I understand is that you have data buffers being generated, which you need to compress and store to disk.

If order matters, and the data source is not so time-critical that data would be lost before the next cycle, then you could use the approach below.

Thread A

Generate a data buffer.

Signal Thread B that data is available.

Thread B

Wait for the signal from Thread A.

Retrieve the data and compress it.

Signal Thread C to store it.

Thread C

Wait for the signal from Thread B.

Retrieve the compressed data and store it to disk.
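
The three-thread hand-off above could be sketched roughly as follows. This is only an illustration under assumptions, not the answerer's code: `Channel`, `compress`, and `run_pipeline` are hypothetical names, the "signal" is modelled with a `std::condition_variable`, the compressor is a toy run-length encoder standing in for a real one, and the "disk" is just a vector.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking FIFO: push() signals the consumer, pop() waits for the signal.
template <typename T>
class Channel {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!closed_);  // pushing after close() would be a bug
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }
    // Marks the end of the stream so a waiting consumer can stop.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Blocks until an item arrives; returns false once closed and drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Toy stand-in for a real compressor: run-length encoding ("aaabb" -> "3a2b").
std::string compress(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t j = i;
        while (j < in.size() && in[j] == in[i]) ++j;
        out += std::to_string(j - i);
        out += in[i];
        i = j;
    }
    return out;
}

// Runs the A -> B -> C pipeline and returns what thread C "stored", in order.
std::vector<std::string> run_pipeline(const std::vector<std::string>& buffers) {
    Channel<std::string> raw, compressed;
    std::vector<std::string> stored;  // stand-in for the disk

    std::thread b([&] {  // Thread B: wait for A, compress, signal C
        std::string buf;
        while (raw.pop(buf)) compressed.push(compress(buf));
        compressed.close();
    });
    std::thread c([&] {  // Thread C: wait for B, store
        std::string buf;
        while (compressed.pop(buf)) stored.push_back(buf);
    });

    // Thread A (the caller): generate buffers and signal B.
    for (const auto& buf : buffers) raw.push(buf);
    raw.close();

    b.join();
    c.join();
    return stored;
}
```

Because each stage is a single thread reading from a FIFO, buffers reach the store stage in the order they were generated. The cost is that compression itself runs on only one thread.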

Another useful and highly efficient design pattern is to have a pool of threads all pulling from a single queue of tasks. Each task, upon completion, generates a new task and pushes it to the queue.

The data generation task, upon completion, generates the compression task. The compression task, upon completion, generates the storage task.

Now, if you want all storage tasks to happen sequentially, use a separate queue for those, and have just one dedicated thread pulling tasks from that queue.

The advantage is that this creates a very clean and general design, in which direct message passing is avoided; instead, a concurrent queue provides the reliability, and there is minimal context switching. It is highly scalable, because it will always make use of as many threads as you have in the pool. This is optimal if you don't have any ordering constraints (such as "buffer #n must be written to disk before buffer #(n+1)").
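
A minimal sketch of this pattern, assuming the variant with a separate storage queue drained by one dedicated thread. `TaskQueue`, `worker_loop`, `fake_compress`, and `run_pool` are hypothetical names chosen for illustration; the compression step is a placeholder, the "disk" is a vector, and data generation is done by the caller rather than as a task, to keep the shutdown logic short.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical concurrent task queue; pop() blocks until a task arrives or the queue closes.
class TaskQueue {
public:
    using Task = std::function<void()>;

    void push(Task task) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }
    // Returns false once the queue is closed and every queued task has been taken.
    bool pop(Task& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !tasks_.empty() || closed_; });
        if (tasks_.empty()) return false;
        out = std::move(tasks_.front());
        tasks_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<Task> tasks_;
    bool closed_ = false;
};

// Every worker repeatedly pulls a task and runs it until its queue is closed and empty.
void worker_loop(TaskQueue& queue) {
    TaskQueue::Task task;
    while (queue.pop(task)) task();
}

// Placeholder for real compression (assumption for illustration only).
std::string fake_compress(const std::string& in) { return "compressed(" + in + ")"; }

std::vector<std::string> run_pool(const std::vector<std::string>& buffers, unsigned pool_size) {
    assert(pool_size > 0);
    TaskQueue cpu_tasks;            // shared by the whole pool
    TaskQueue io_tasks;             // drained by one thread, so writes happen sequentially
    std::vector<std::string> disk;  // stand-in for the file; only the I/O thread touches it

    std::vector<std::thread> pool;
    for (unsigned i = 0; i < pool_size; ++i)
        pool.emplace_back(worker_loop, std::ref(cpu_tasks));
    std::thread io_thread(worker_loop, std::ref(io_tasks));

    for (const auto& buf : buffers) {
        cpu_tasks.push([buf, &io_tasks, &disk] {
            std::string packed = fake_compress(buf);
            // On completion, the compression task generates the storage task.
            io_tasks.push([packed, &disk] { disk.push_back(packed); });
        });
    }

    cpu_tasks.close();  // no more compression tasks will be added
    for (auto& t : pool) t.join();
    io_tasks.close();   // all storage tasks are queued once the pool has finished
    io_thread.join();
    return disk;
}
```

Note that in this sketch the storage order is the order in which compression tasks finish, which may differ from the order in which buffers were submitted.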

I think you can use two semaphores to synchronize the worker threads. For example:

  • compressSemaphore
  • storeSemaphore

You also need two queues to hold the data:

  • toCompressData
  • toStoreData

Your main thread creates the data buffers, puts each one into toCompressData, and signals compressSemaphore.

Your compress worker thread waits on compressSemaphore and, once signaled by the main thread, takes data from toCompressData. The compress thread then puts the compressed data into toStoreData and signals storeSemaphore.

Your store worker thread waits on storeSemaphore and takes data from toStoreData once the semaphore is signaled.

You can use a fixed number of compress and store worker threads, or adjust the number dynamically.
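
The two-semaphore, two-queue scheme might look something like the sketch below. The names `compressSemaphore`, `storeSemaphore`, `toCompressData`, and `toStoreData` come from the description above; everything else (`Semaphore`, `Item`, `put`, `take`, `run_with_semaphores`, the stop-marker shutdown, and the placeholder compression) is an assumption added for illustration. The semaphore is hand-rolled from a mutex and condition variable so the sketch does not depend on C++20; with C++20, `std::counting_semaphore` from `<semaphore>` could be used instead.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Minimal counting semaphore (hypothetical helper).
class Semaphore {
public:
    void signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    unsigned count_ = 0;
};

// A data buffer, or a stop marker telling a worker to exit (assumed shutdown scheme).
struct Item {
    std::string data;
    bool stop;
};

// Enqueue under the queue's mutex, then signal that one more item is available.
void put(std::queue<Item>& q, std::mutex& m, Semaphore& s, Item item) {
    {
        std::lock_guard<std::mutex> lock(m);
        q.push(std::move(item));
    }
    s.signal();
}

// Wait until the semaphore says an item is available, then dequeue it.
Item take(std::queue<Item>& q, std::mutex& m, Semaphore& s) {
    s.wait();
    std::lock_guard<std::mutex> lock(m);
    Item item = std::move(q.front());
    q.pop();
    return item;
}

std::vector<std::string> run_with_semaphores(const std::vector<std::string>& buffers,
                                             unsigned compress_workers) {
    assert(compress_workers > 0);
    Semaphore compressSemaphore, storeSemaphore;
    std::queue<Item> toCompressData, toStoreData;
    std::mutex compressMutex, storeMutex;
    std::vector<std::string> disk;  // stand-in for the file; only the store thread writes it

    std::vector<std::thread> compressors;
    for (unsigned i = 0; i < compress_workers; ++i) {
        compressors.emplace_back([&] {
            for (;;) {
                Item item = take(toCompressData, compressMutex, compressSemaphore);
                if (item.stop) break;
                item.data = "compressed(" + item.data + ")";  // placeholder compression
                put(toStoreData, storeMutex, storeSemaphore, std::move(item));
            }
        });
    }
    std::thread store([&] {
        for (;;) {
            Item item = take(toStoreData, storeMutex, storeSemaphore);
            if (item.stop) break;
            disk.push_back(std::move(item.data));
        }
    });

    // Main thread: queue every buffer, then one stop marker per compress worker.
    for (const auto& buf : buffers)
        put(toCompressData, compressMutex, compressSemaphore, Item{buf, false});
    for (unsigned i = 0; i < compress_workers; ++i)
        put(toCompressData, compressMutex, compressSemaphore, Item{"", true});
    for (auto& t : compressors) t.join();

    // All compressed data is queued by now, so the store thread can be told to stop.
    put(toStoreData, storeMutex, storeSemaphore, Item{"", true});
    store.join();
    return disk;
}
```

With more than one compress worker, items can reach toStoreData out of order, so this sketch only suits the case where write order does not matter.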
