
What is the advantage of a parallel pipeline compared with Task Parallelism?

I often read that the pipeline pattern is a common and useful way to exploit concurrency. But I wonder whether the parallel pipeline pattern has any advantage over the Task Parallelism pattern.

Suppose we have three stages in a pipeline: A, B, C. When data needs to be processed, A takes it, processes it and hands it over to B. When the next data chunk comes in, the same happens, so A and B are working concurrently.

So different stages of the pipeline can execute in parallel. But when we use three pipelines working in parallel (as in the Task Parallelism pattern), we get exactly the same picture: when two data chunks come in one after another, the first chunk is taken by Pipeline 1, the next chunk is taken by Pipeline 2, and both chunks are processed concurrently.

Furthermore, I can easily imagine a lot of problems with a parallel pipeline: the buffer between the stages could block (or overflow), or one stage could be much slower than the others, so that all stages before the slowest one have to wait, and so on.

These problems do not exist in the Task Parallelism pattern. Additionally, that pattern is more flexible when the chunks come in faster than the first stage of the pipeline can process them (or when they can be fetched concurrently).

So why should I ever use the parallel pipeline pattern?

Thanks in advance for any ideas!

If you have a pipeline A => B => C and no further restrictions on it, a pipeline is indeed useless. You could have just used a function C(B(A(input))).
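As a rough illustration of that point, here is a minimal Python sketch in which the three stages are collapsed into one function and mapped over the inputs by a worker pool. The stage bodies (`stage_a`, `stage_b`, `stage_c`) and the helper `run_all` are hypothetical placeholders, not anything from the question.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stage functions standing in for A, B and C.
def stage_a(x):
    return x + 1

def stage_b(x):
    return x * 2

def stage_c(x):
    return x - 3

def process(item):
    # The whole "pipeline" collapsed into one call per item: C(B(A(input))).
    return stage_c(stage_b(stage_a(item)))

def run_all(items, workers=3):
    # Each worker runs A, B and C for one item; different items run in parallel.
    # pool.map returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, items))
```

With no per-stage limits there are no queues and no stage threads to manage, which is why the plain task-parallel form is simpler here.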

The concept becomes more useful if you allow different degrees of parallelism at the pipeline stages. Maybe step B accesses an SSD and you want at most 4 concurrent accesses. You could achieve the same thing with a semaphore.
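The semaphore variant could look roughly like the sketch below. It assumes a thread pool that is larger than the limit on B; the `time.sleep` call is only a stand-in for the SSD access, and the counters (`_active`, `_peak`, `peak_concurrency`) exist purely to make the limit observable.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_B = 4                             # at most 4 concurrent stage-B calls
_b_slots = threading.Semaphore(MAX_B)

_lock = threading.Lock()
_active = 0                           # stage-B calls currently running
_peak = 0                             # highest value _active ever reached

def stage_a(x):
    return x + 1

def stage_b(x):
    global _active, _peak
    with _b_slots:                    # blocks while 4 calls are already inside
        with _lock:
            _active += 1
            _peak = max(_peak, _active)
        time.sleep(0.01)              # placeholder for the SSD access
        with _lock:
            _active -= 1
        return x * 2

def stage_c(x):
    return x - 3

def process(item):
    return stage_c(stage_b(stage_a(item)))

def run_all(items, workers=16):
    # 16 workers run A and C freely, but only 4 can be inside B at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, items))

def peak_concurrency():
    # Highest number of simultaneous stage-B calls observed so far.
    return _peak
```

One trade-off of this form is that workers waiting on the semaphore are idle threads, whereas a pipeline with a dedicated 4-worker stage B would keep the waiting items in a queue instead.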

If A, B and C are each limited to a degree of parallelism of 1, the pipeline also has value: in the pipeline model all 3 nodes can execute concurrently. Using "three pipelines" as you put it is impossible because of the assumed parallelism limit (or you'd need 3 locks, which is equivalent to the pipeline solution).
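A minimal sketch of that case, assuming one thread per stage connected by queues: each stage processes one item at a time, yet up to three items are in flight at once. The names `run_stage`, `run_pipeline` and the `DONE` sentinel are illustrative choices, not part of any library.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the stream

def run_stage(func, inbox, outbox):
    # One thread per stage, so this stage's degree of parallelism is 1.
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # forward the sentinel so later stages stop too
            return
        outbox.put(func(item))

def run_pipeline(items, stages):
    # queues[i] feeds stage i; the last queue collects the final results.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(DONE)

    results = []
    while True:
        out = queues[-1].get()
        if out is DONE:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline([1, 2, 3], [a, b, c])` runs `a` on the second item while `b` works on the first, and because every stage is a single FIFO consumer the output order matches the input order.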

Sometimes, you want buffering between the nodes. Maybe A occasionally emits large bursts that B will process over time. Buffering helps keep A working instead of stalling.
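A bounded queue is one common way to get such a buffer. In the sketch below (hypothetical names throughout, with `time.sleep` standing in for B's slower work), A can run ahead of B by up to `buffer_size` items and only blocks once the buffer is full.

```python
import queue
import threading
import time

DONE = object()  # sentinel marking the end of the stream

def producer(outbox, n):
    # Stage A: emits a burst of n items as fast as the buffer allows.
    for i in range(n):
        outbox.put(i)          # blocks only while the buffer is full
    outbox.put(DONE)

def consumer(inbox, results):
    # Stage B: slower than A, drains the buffer over time.
    while True:
        item = inbox.get()
        if item is DONE:
            return
        time.sleep(0.001)      # placeholder for B's processing time
        results.append(item * 2)

def run_buffered(n=20, buffer_size=8):
    buf = queue.Queue(maxsize=buffer_size)  # bounded buffer between A and B
    results = []
    a = threading.Thread(target=producer, args=(buf, n))
    b = threading.Thread(target=consumer, args=(buf, results))
    a.start()
    b.start()
    a.join()
    b.join()
    return results
```

The `maxsize` bound is also what addresses the overflow concern from the question: a burst larger than the buffer slows A down rather than growing memory without limit.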

Sometimes, it's not a pipeline but a dataflow network that branches out and back in (possibly with joins).

All in all, I very rarely find a use case for dataflow networks. Often, it's simpler to just use data parallelism with appropriate locks and semaphores. But this might be because of the domains I typically work in. YMMV.

Pipeline and Task Parallelism are definitely 2 different concepts.

  • Pipeline:

    Implements the Producer-Consumer pattern. ProcessA gets some data, processes it, and passes it to the next one (ProcessB). B can't do anything until A has finished its processing. The same holds for B and C, etc. There are dependencies among the processes.

    Ex: Refer to this

  • Task Parallelism:

    There are simply no dependencies.

    Ex: parallel loops

So, you can't use Task Parallelism for dependent tasks.


 