How do you deal with futures and mapAsync in Akka Flow?

I built an Akka graph with the GraphDSL that defines a simple flow. The flow f4 takes 3 seconds to emit an element, while f2 takes 10 seconds.

As a result, I got 3, 2, 3, 2, which is not what I want. Since f2 takes so much longer, I would like to get 3, 3, 2, 2. Here's the code:

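import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, ClosedShape, ThrottleMode}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Merge, RunnableGraph, Sink, Source}
import scala.concurrent.Future
import scala.concurrent.duration._

implicit val actorSystem = ActorSystem("NumberSystem")
implicit val materializer = ActorMaterializer()
import actorSystem.dispatcher // ExecutionContext needed by the Futures below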

val g = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder: GraphDSL.Builder[NotUsed] =>
  import GraphDSL.Implicits._
  val in = Source(List(1, 1))
  val out = Sink.foreach(println)

  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))



  val yourMapper: Int => Future[Int] = (i: Int) => Future(i + 1)
  val yourMapper2: Int => Future[Int] = (i: Int) => Future(i + 2)

  val f1, f3 = Flow[Int]
  val f2= Flow[Int].throttle(1, 10.second, 0, ThrottleMode.Shaping).mapAsync[Int](2)(yourMapper)
  val f4= Flow[Int].throttle(1, 3.second, 0, ThrottleMode.Shaping).mapAsync[Int](2)(yourMapper2)

  in ~> f1 ~> bcast ~> f2 ~> merge ~> f3 ~> out
  bcast ~> f4 ~> merge
  ClosedShape
})
g.run()

So where am I going wrong? With Future, with mapAsync, or somewhere else? Thanks.

Sorry, I'm new to Akka, so I'm still learning. One way to get the expected results is to add async boundaries:

val g = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder: GraphDSL.Builder[NotUsed] =>
  import GraphDSL.Implicits._
  val in = Source(List(1, 1))
  val out = Sink.foreach(println)

  val bcast = builder.add(Broadcast[Int](2))
  val merge = builder.add(Merge[Int](2))



  val yourMapper: Int => Future[Int] = (i: Int) => Future(i + 1)
  val yourMapper2: Int => Future[Int] = (i: Int) => Future(i + 2)

  val f1, f3 = Flow[Int]
  val f2= Flow[Int].throttle(1, 10.second, 0, ThrottleMode.Shaping).map(_+1)
    //.mapAsyncUnordered[Int](2)(yourMapper)
  val f4= Flow[Int].throttle(1, 3.second, 0, ThrottleMode.Shaping).map(_+2)
    //.mapAsync[Int](2)(yourMapper2)

  in ~> f1 ~> bcast ~> f2.async ~> merge ~> f3 ~> out
  bcast ~> f4.async ~> merge
  ClosedShape
})
g.run()

As you've already figured out, replacing:

mapAsync(parallelism)(i => Future(i + delta))

with:

map(_ + delta).async

in the two flows would achieve what you want.

The different result boils down to the key difference between mapAsync and map + async. While mapAsync enables execution of Futures in parallel threads, the multiple mapAsync flow stages are still managed by the same underlying actor, which carries out operator fusion before execution (generally for cost efficiency).
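To illustrate the first half of that point, here is a minimal sketch (not part of the original answer) of how mapAsync behaves within a single stage: up to `parallelism` Futures may be in flight at once, yet results are emitted in upstream order. The object name MapAsyncOrdering and the helper slowInc are hypothetical, and the sleeps only simulate work.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object MapAsyncOrdering {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("MapAsyncOrdering")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    import system.dispatcher // ExecutionContext for the Futures

    // Hypothetical helper: smaller inputs sleep longer, so the Futures
    // complete in reverse order (3 first, then 2, then 1).
    def slowInc(i: Int): Future[Int] = Future {
      Thread.sleep((4 - i) * 100L)
      i + 1
    }

    // Up to 3 Futures run concurrently, but mapAsync still emits
    // results in the order the inputs arrived.
    val result = Source(List(1, 2, 3)).mapAsync(3)(slowInc).runWith(Sink.seq)

    try Await.result(result, 5.seconds)
    finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    println(run()) // Vector(2, 3, 4)
}
```

Swapping in mapAsyncUnordered would let the fastest Future win instead, but either way this concerns ordering inside one stage, not how the two branches of the graph are scheduled relative to each other.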

On the other hand, async introduces an asynchronous boundary into the stream, with the individual flow stages handled by separate actors. In your case, each of the two flow stages emits elements downstream independently, and whichever element is emitted first gets consumed first. Inevitably there is a cost to managing the stream across the asynchronous boundary, and Akka Streams uses a windowed buffering strategy to amortize that cost (see the Akka Streams documentation).
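As a rough illustration of that buffering, the sketch below marks a flow as its own asynchronous island and shrinks its input buffer with Attributes.inputBuffer. The object name AsyncBoundaryBuffer and the buffer size of 1 are assumptions chosen for the example, not values from the question.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Attributes}
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object AsyncBoundaryBuffer {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("AsyncBoundaryBuffer")
    implicit val materializer: ActorMaterializer = ActorMaterializer()

    // .async makes this flow run in its own actor; the attribute limits
    // the buffer at the boundary to a single element (assumed size).
    val isolated = Flow[Int]
      .map(_ + 1)
      .async
      .addAttributes(Attributes.inputBuffer(initial = 1, max = 1))

    val result = Source(List(1, 1)).via(isolated).runWith(Sink.seq)

    try Await.result(result, 5.seconds)
    finally system.terminate()
  }

  def main(args: Array[String]): Unit =
    println(run()) // Vector(2, 2)
}
```

A plausible reading of the question's result is that the buffer at each branch's boundary lets Broadcast hand elements to the faster branch without waiting for the slower one to ask for more, so shrinking the buffer this far may bring back some of the original coupling.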

For more details on the difference between mapAsync and async, this blog post might be of interest.

So you are trying to join together the results coming out of f2 and f4. In that case, you're trying to do what is sometimes called the "scatter-gather pattern".

I don't think there is an off-the-shelf way to implement it without adding a custom stateful stage that keeps track of the outputs from f2 and f4 and emits a record when both are available. But there are some complications to bear in mind:

  • What happens if f2 or f4 fails
  • What happens if they take too long
  • You need a unique key for each input record, so you know which output from f2 corresponds to which output from f4 (or vice versa)
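The keyed bookkeeping described above could look roughly like the following plain-Scala sketch. It covers only the pairing logic, not an actual Akka stage, and it deliberately ignores the failure and timeout cases from the list. All names here (BranchOutput, FromF2, FromF4, Joined, Pending) are hypothetical.

```scala
object ScatterGatherPairing {
  // Hypothetical tagged outputs; `key` identifies the originating input record.
  sealed trait BranchOutput { def key: Long; def value: Int }
  final case class FromF2(key: Long, value: Int) extends BranchOutput
  final case class FromF4(key: Long, value: Int) extends BranchOutput

  final case class Joined(key: Long, f2: Int, f4: Int)

  // Outputs still waiting for their counterpart from the other branch.
  final case class Pending(f2: Map[Long, Int] = Map.empty,
                           f4: Map[Long, Int] = Map.empty) {
    // Returns the updated state plus a joined record once both sides are known.
    def offer(out: BranchOutput): (Pending, Option[Joined]) = out match {
      case FromF2(k, v) =>
        f4.get(k) match {
          case Some(other) => (copy(f4 = f4 - k), Some(Joined(k, v, other)))
          case None        => (copy(f2 = f2 + (k -> v)), None)
        }
      case FromF4(k, v) =>
        f2.get(k) match {
          case Some(other) => (copy(f2 = f2 - k), Some(Joined(k, other, v)))
          case None        => (copy(f4 = f4 + (k -> v)), None)
        }
    }
  }

  // Folds a sequence of arrivals through the state, collecting joined records.
  def joinAll(outputs: Seq[BranchOutput]): Seq[Joined] = {
    val (_, joined) = outputs.foldLeft((Pending(), Vector.empty[Joined])) {
      case ((state, acc), out) =>
        val (next, emitted) = state.offer(out)
        (next, acc ++ emitted)
    }
    joined
  }

  def main(args: Array[String]): Unit = {
    // The fast branch (f4) delivers both results before the slow one (f2).
    val arrivals = Seq(FromF4(0, 3), FromF4(1, 3), FromF2(0, 2), FromF2(1, 2))
    joinAll(arrivals).foreach(println)
  }
}
```

In a real stream the same state transition would sit inside a stateful stage placed after the Merge, and unmatched entries would need to be evicted after some deadline so that a failed or stalled branch does not grow the maps without bound.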
