Difference between Balance and Broadcast fan out in Akka Streams

I am a little confused about the fan-out strategies in Akka Streams. I read that Broadcast (1 input, N outputs) emits each input element to every output, while Balance (1 input, N outputs) emits each input element to one of its output ports.

Can you explain:

  1. How does Balance work with multiple consumers?
  2. What does the phrase "emits to one of its output ports" mean?
  3. Is a port the same thing as a downstream?
  4. Does Balance replicate the input stream into several output partitions?
  5. What does "balance is enabling graphs to be split apart and multiple instances of downstream subscribers replicated to handle the volume" mean?

From the documentation: broadcast emits (sends) each element to every consumer, while balance emits each element only to the first available consumer.

broadcast

Emit each incoming element each of n outputs.

balance

Fan-out the stream to several streams. Each upstream element is emitted to the first available downstream consumer.
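
The worker-pool reading of balance (the "multiple instances of downstream subscribers replicated to handle the volume" from question 5) can be sketched roughly as below. This is a minimal sketch, assuming Akka 2.6 or later, where an implicit ActorSystem supplies the materializer. The names BalancedWorkers, balanced, and run are made up for illustration, and the doubling flow is only a stand-in for real work.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.FlowShape
import akka.stream.scaladsl.{Balance, Flow, GraphDSL, Merge, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object BalancedWorkers {
  // Hypothetical helper: runs `workerCount` copies of `worker` side by side.
  // Balance hands each element to exactly one copy; Merge collects the results.
  def balanced[In, Out](worker: Flow[In, Out, Any], workerCount: Int): Flow[In, Out, NotUsed] =
    Flow.fromGraph(GraphDSL.create() { implicit b =>
      import GraphDSL.Implicits._

      val balance = b.add(Balance[In](workerCount))
      val merge   = b.add(Merge[Out](workerCount))

      // Each pass through the loop wires the next free Balance output
      // to its own asynchronous instance of the worker flow.
      for (_ <- 1 to workerCount)
        balance ~> worker.async ~> merge

      FlowShape(balance.in, merge.out)
    })

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("balanced-workers")
    try {
      val doubled = Source(1 to 10)
        .via(balanced(Flow[Int].map(_ * 2), workerCount = 3))
        .runWith(Sink.seq)
      Await.result(doubled, 10.seconds)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Every input element is processed by exactly one worker, so nothing is duplicated. Because the workers run concurrently and Merge emits whatever arrives first, the order of the results is not guaranteed.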

EDIT from comments:

From your gist, you should make two averageCarrierDelay functions, one each for Z and F. Then you can see all the elements sent to each of them.

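import akka.stream.scaladsl.Flow
import scala.util.Try

// FlightDelayRecord is the case class from the gist in the question.

// Copy wired to Z: prints every record it receives, then accumulates
// (carrier, record count, total delay minutes) per carrier.
val averageCarrierDelayZ =
  Flow[FlightDelayRecord]
    .groupBy(30, _.uniqueCarrier)
    .fold(("", 0, 0)) {
      (x: (String, Int, Int), y: FlightDelayRecord) => {
        println(s"Z Received Element: ${y}")
        val count = x._2 + 1
        // A delay that does not parse as an Int counts as 0 minutes.
        val totalMins = x._3 + Try(y.arrDelayMins.toInt).getOrElse(0)
        (y.uniqueCarrier, count, totalMins)
      }
    }
    .mergeSubstreams

// Copy wired to F: identical apart from the label it prints.
val averageCarrierDelayF =
  Flow[FlightDelayRecord]
    .groupBy(30, _.uniqueCarrier)
    .fold(("", 0, 0)) {
      (x: (String, Int, Int), y: FlightDelayRecord) => {
        println(s"F Received Element: ${y}")
        val count = x._2 + 1
        val totalMins = x._3 + Try(y.arrDelayMins.toInt).getOrElse(0)
        (y.uniqueCarrier, count, totalMins)
      }
    }
    .mergeSubstreams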

Edit 2: To check things like this in the future, I'd recommend a generic logger for stream stages so you can see what is going on.

def logElement[A](msg: String) = Flow[A].map { a => println(s"${msg} ${a}"); a }

This allows you to do something like:

D ~> logElement[FlightDelayRecord]("F received: ") ~> F
D ~> logElement[FlightDelayRecord]("Z received: ") ~> Z

This way you can check any part of your graph for behavior you were not expecting.
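
For anyone who wants to try the logger outside the graph from the gist, here is a small self-contained sketch. It assumes Akka 2.6 or later; the object name LogElementDemo, the run method, and the doubling step are invented for illustration and are not part of the original code.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object LogElementDemo {
  // Same idea as logElement above: print each element, then pass it on unchanged.
  def logElement[A](msg: String): Flow[A, A, NotUsed] =
    Flow[A].map { a => println(s"$msg $a"); a }

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("log-element-demo")
    try {
      val result = Source(1 to 3)
        .via(logElement[Int]("before doubling:")) // tap the stream here
        .map(_ * 2)
        .via(logElement[Int]("after doubling:"))  // and again further down
        .runWith(Sink.seq)
      Await.result(result, 10.seconds)
    } finally system.terminate()
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The logging flow only observes elements, so the stream produces the same result with or without it.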

As others have already said, broadcast emits its input to all output ports, while balance emits its input to one output port, chosen based on backpressure.

When you wire these stages into a graph with GraphDSL, you need to choose which output port each consumer is connected to. Consider this example:

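import akka.stream.{ClosedShape, OverflowStrategy}
import akka.stream.scaladsl.{Balance, Broadcast, GraphDSL, Keep, RunnableGraph, Sink, Source}

val q1 = Source.queue[Int](10, OverflowStrategy.fail)
val q2 = Source.queue[Int](10, OverflowStrategy.fail)

// Wrapping the graph in RunnableGraph makes it runnable; running it
// materializes the two queues so elements can be offered to them.
val graph = RunnableGraph.fromGraph(
  GraphDSL.create(q1, q2)(Keep.both) { implicit b => (input1, input2) =>
    import GraphDSL.Implicits._

    // Both fan-out stages are created with 2 output ports.
    val broadcast = b.add(Broadcast[Int](2))
    val balance = b.add(Balance[Int](2))

    // `val a, b = expr` evaluates expr once per name, so this adds four separate sinks.
    val consumer1, consumer2, consumer3, consumer4 = b.add(Sink.foreach[Int](println))

    input1 ~> broadcast.in
    input2 ~> balance.in

    broadcast.out(0) ~> consumer1
    broadcast.out(1) ~> consumer2

    balance.out(0) ~> consumer3
    balance.out(1) ~> consumer4

    ClosedShape
  })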

Here we connect one input to a broadcast stage and one to a balance stage. Then we connect the output ports of the broadcast and balance stages to their respective consumers.

In this particular case, when you run the stream, elements coming through the first input will be passed to both consumer1 and consumer2, because a broadcast stage copies its input to all of its outputs (here there are two). Elements coming through the second input will be split between consumer3 and consumer4, each element going to only one of them. How evenly they are split depends on the speed of your terminal (i.e. the speed of println), because Sink.foreach backpressures while its function is executing.

Note that we have specified that the broadcast and balance stages have 2 output ports each (when calling their factory methods), and that we have specified which output port is connected to which consumer.
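
To see the difference without reading interleaved println output, one option is to collect what each consumer receives with Sink.seq. The sketch below is not the code from the answer: it assumes Akka 2.6 or later, uses a finite Source(1 to 10) instead of queues, and the names FanOutDemo, collect, viaBroadcast, and viaBalance are made up for illustration.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ClosedShape, Graph, UniformFanOutShape}
import akka.stream.scaladsl.{Balance, Broadcast, GraphDSL, Keep, RunnableGraph, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object FanOutDemo {
  // Sends 1 to 10 through the given two-port fan-out stage and returns
  // what the consumers on port 0 and port 1 each received.
  private def collect(fanOut: Graph[UniformFanOutShape[Int, Int], NotUsed]): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("fan-out-demo")
    try {
      val graph = RunnableGraph.fromGraph(
        GraphDSL.create(Sink.seq[Int], Sink.seq[Int])(Keep.both) { implicit b => (consumer1, consumer2) =>
          import GraphDSL.Implicits._

          val stage = b.add(fanOut)

          Source(1 to 10) ~> stage.in
          stage.out(0) ~> consumer1
          stage.out(1) ~> consumer2

          ClosedShape
        })
      val (first, second) = graph.run()
      (Await.result(first, 10.seconds), Await.result(second, 10.seconds))
    } finally system.terminate()
  }

  def viaBroadcast(): (Seq[Int], Seq[Int]) = collect(Broadcast[Int](2))
  def viaBalance(): (Seq[Int], Seq[Int]) = collect(Balance[Int](2))

  def main(args: Array[String]): Unit = {
    println(s"broadcast: ${viaBroadcast()}")
    println(s"balance:   ${viaBalance()}")
  }
}
```

With Broadcast, both consumers receive all ten elements. With Balance, each element reaches exactly one consumer, so the two sequences together contain 1 to 10 once; how they are divided can vary from run to run.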
