
What's the advantage of the streaming support in Anorm (Play Scala)?

I have been reading the section Streaming results in the Play docs. What I expected to find was a way to create a Scala Stream based on the results, so that if I ran a query returning 10,000 rows that need to be parsed, it would either parse them in batches (e.g. 100 at a time) or parse only the first one and then the rest as they are needed (so, a Stream).

What I found (from my understanding, and I might be completely wrong) is basically a way to parse the results one by one, but at the end it builds a list with all the parsed results (with an arbitrary limit if you like, in this case 100 books). Let's take this example from the docs:

val books: Either[List[Throwable], List[String]] =
  SQL("Select name from Books").foldWhile(List[String]()) { (list, row) =>
    if (list.size == 100) (list -> false) // stop with `list`
    else (list :+ row[String]("name")) -> true // continue with one more name
  }
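To see what the stop-early contract of foldWhile buys you, here is a minimal plain-Scala sketch of the same pattern (no database involved; `rows` is just a stand-in for the result set, and the names `foldWhileDemo`/`firstHundred` are my own):

```scala
object FoldWhileDemo {
  // Generic fold that stops as soon as the step function returns `false`,
  // mirroring Anorm's `foldWhile` contract: the remaining rows are never
  // pulled from the iterator, let alone parsed.
  def foldWhile[A, B](rows: Iterator[A], init: B)(step: (B, A) => (B, Boolean)): B = {
    var acc = init
    var continue = true
    while (continue && rows.hasNext) {
      val (next, cont) = step(acc, rows.next())
      acc = next
      continue = cont
    }
    acc
  }

  // Collect at most 100 names, then stop consuming the iterator.
  def firstHundred(names: Iterator[String]): List[String] =
    foldWhile(names, List.empty[String]) { (list, name) =>
      if (list.size == 100) (list, false)
      else (list :+ name, true)
    }
}
```

The point is that the fold aborts the traversal itself; a plain `.as(...)` parser has no way to do that, because by the time you see a `List` every row has already been parsed.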

What advantages does that provide over a basic implementation such as:

val books: List[String] = SQL("Select name from Books").as(SqlParser.str("name").*) // collect every row's name into a List

Parsing a very large number of rows all at once is just inefficient. You might not notice it for a simple class, but once you start adding a few joins and have a more complex parser, you will see a huge performance hit when the row count gets into the thousands.

From my personal experience, queries that return 5,000 - 10,000 rows (and more) that the parser tries to handle all at once consume so much CPU time that the program effectively hangs indefinitely.

Streaming avoids the problem of trying to parse everything at once, or even waiting for all of the results to make it back from the database server over the wire.

What I would suggest is using the Anorm query result as a Source with Akka Streams. I have successfully streamed hundreds of thousands of rows that way.
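As a sketch of that suggestion (assuming the separate anorm-akka module on the classpath, a running ActorSystem, and an implicit java.sql.Connection; the Books table is from the question, and `handleBatch` is a hypothetical consumer):

```scala
import java.sql.Connection
import scala.concurrent.Future

import akka.stream.scaladsl.Source
import anorm._

object StreamBooks {
  // Expose the query as an Akka Streams Source: rows are parsed one at a
  // time as downstream demand arrives, instead of all at once.
  def names(implicit con: Connection): Source[String, Future[Int]] =
    AkkaStream.source(
      SQL"SELECT name FROM Books",
      SqlParser.scalar[String],
      ColumnAliaser.empty
    )

  // Example downstream: process in batches of 100 without ever
  // materializing the whole result set in memory, e.g.
  //   names.grouped(100).runForeach(handleBatch)
}
```

Backpressure is the key design point here: the database cursor is only advanced as fast as the downstream stages consume elements, so a slow consumer never forces the full 10,000-row result into memory.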

