
Akka Streams: asyncBoundary vs mapAsync

I am trying to understand the difference between asyncBoundary and mapAsync. At first glance I assumed they would behave the same. However, when I run the code, the asyncBoundary version appears to finish faster than the mapAsync version.

Here is the code:

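import akka.actor.ActorSystem
import akka.stream.{ActorMaterializer, Attributes}
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Future

implicit val system = ActorSystem("sourceDemo")
implicit val materializer = ActorMaterializer()
import system.dispatcher // ExecutionContext required by Future { ... }

Source(1 to 100).mapAsync(100)(t => Future {t + 1}).mapAsync(100)(t => Future {t * 2}).map(println).to(Sink.ignore).run()
Source(1 to 100).map(_ + 1).withAttributes(Attributes.asyncBoundary).map(_ * 2).map(t => println("async boundary", t)).to(Sink.ignore).run()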

The output: the asyncBoundary stream always finishes sooner than the mapAsync stream.

From the documentation on asyncBoundary ( https://doc.akka.io/docs/akka-stream-and-http-experimental/current/scala/stream-flows-and-basics.html ), I understand that it lets stages run on different CPUs, whereas mapAsync gets its concurrency from Future. But a Future is also asynchronous.

Could someone clarify the difference between these two APIs?

Async

As you correctly point out, this forces the insertion of an asynchronous boundary between two stages. In your example

Source(1 to 100).map(_ + 1).withAttributes(Attributes.asyncBoundary).map(_ * 2).map(t => println("async boundary", t)).to(Sink.ignore).run()

this means in practice that the + 1 operation and the * 2 operation will be run by separate actors. This enables pipelining: while one element moves on to the * 2 stage, another element can be brought in for the + 1 stage at the same time. If you don't force an async boundary there, the same actor will run the operations sequentially, performing both of them on one element before requesting a new one from upstream.

By the way, your example can be written more concisely using the async combinator:

Source(1 to 100).map(_ + 1).async.map(_ * 2).map(t => println("async boundary", t)).to(Sink.ignore).run()
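To make the point concrete that the boundary changes how the stages are executed and not what they compute, here is a minimal sketch that runs the fused and the pipelined variant and collects both results. It assumes the Akka 2.5-era API used in the question (ActorMaterializer); the object name AsyncBoundaryDemo, the Sink.seq collection and the 10 second timeout are my own choices for illustration, not something from the question.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Illustrative name; not part of the original question.
object AsyncBoundaryDemo {
  // Returns (result of the fused stream, result of the stream with an async boundary).
  def run(): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("asyncBoundaryDemo")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    try {
      // Fused: a single actor runs both map stages for each element.
      val fused = Source(1 to 100).map(_ + 1).map(_ * 2).runWith(Sink.seq[Int])
      // With .async: the + 1 stage and the * 2 stage are run by separate actors.
      val pipelined = Source(1 to 100).map(_ + 1).async.map(_ * 2).runWith(Sink.seq[Int])
      (Await.result(fused, 10.seconds), Await.result(pipelined, 10.seconds))
    } finally {
      system.terminate()
    }
  }
}
```

Both streams emit the same elements in the same order; only the assignment of stages to actors differs.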

mapAsync

This is a stage to parallelise execution of asynchronous operations. The parallelism factor lets you specify the maximum number of Futures that may be in flight at the same time to serve incoming elements. The results of the parallel computations are tracked by the mapAsync stage and emitted downstream in the order of the incoming elements.
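The ordering guarantee can be checked with a small sketch in which earlier elements deliberately take longer than later ones. It assumes the Akka 2.5-era API from the question; MapAsyncOrderDemo and slowTimesTen are hypothetical names, and Thread.sleep is only a stand-in for latency (blocking a dispatcher thread like this is not something you would want in real code).

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

// Illustrative name; not part of the original question.
object MapAsyncOrderDemo {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("mapAsyncOrderDemo")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    import system.dispatcher // ExecutionContext that runs the Futures

    // Earlier elements sleep longer, so their Futures tend to complete last.
    def slowTimesTen(n: Int): Future[Int] = Future {
      Thread.sleep((6 - n) * 20L) // simulated latency only
      n * 10
    }

    try {
      val result = Source(1 to 5)
        .mapAsync(parallelism = 4)(slowTimesTen) // at most 4 Futures in flight
        .runWith(Sink.seq[Int])
      Await.result(result, 10.seconds)
    } finally {
      system.terminate()
    }
  }
}
```

Even though the Futures complete out of order, the collected result is 10, 20, 30, 40, 50. If the order does not matter, mapAsyncUnordered emits each result as soon as its Future completes.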

In your example

Source(1 to 100).mapAsync(100)(t => Future {t + 1}).mapAsync(100)(t => Future {t * 2}).map(println).to(Sink.ignore).run()

potentially up to 100 of the + 1 operations (i.e. all of them) could run in parallel, with the results collected in order. Subsequently, up to 100 of the * 2 operations could run in parallel, and again the results are collected in order and emitted downstream.

In your example you are running quick, CPU-bound operations that don't justify using mapAsync: the infrastructure this stage needs most likely costs far more than you gain by running 100 of these operations in parallel. mapAsync is particularly useful for slow, IO-bound operations, where parallelisation pays off.
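As a rough illustration of the IO-bound case, the sketch below simulates a slow lookup and lets you vary the parallelism. Everything specific here is an assumption for the sake of the example: fetchUser is a hypothetical stand-in for a remote call, the 50 ms delay and the pool size of 8 are arbitrary, and the Akka 2.5-era materializer matches the question rather than current releases.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

// Illustrative name; not part of the original question.
object MapAsyncIoDemo {
  // Returns (results, elapsed milliseconds) for the given parallelism.
  def run(parallelism: Int): (Seq[String], Long) = {
    implicit val system: ActorSystem = ActorSystem("mapAsyncIoDemo")
    implicit val materializer: ActorMaterializer = ActorMaterializer()
    // Dedicated pool for the blocking calls, kept apart from Akka's dispatcher.
    val ioPool: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

    // Hypothetical stand-in for a slow remote lookup (about 50 ms per call).
    def fetchUser(id: Int): Future[String] = Future {
      Thread.sleep(50)
      s"user-$id"
    }(ioPool)

    try {
      val start = System.nanoTime()
      val done = Source(1 to 8).mapAsync(parallelism)(fetchUser).runWith(Sink.seq[String])
      val users = Await.result(done, 10.seconds)
      (users, (System.nanoTime() - start) / 1000000L)
    } finally {
      ioPool.shutdown()
      system.terminate()
    }
  }
}
```

MapAsyncIoDemo.run(1) performs the eight lookups one after another, so it needs roughly 8 × 50 ms, while MapAsyncIoDemo.run(8) lets them overlap and would be expected to finish considerably sooner; the exact timings depend on the machine.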

For a comprehensive read on this topic, check out this blogpost.
