Scala: joining / waiting for growing queue of futures
I launch several async processes which, in turn, can launch more processes if needed (think of traversing a directory structure or something like that). Each process returns something, and in the end I want to wait for all of them to complete and schedule a function that will do something with the resulting collection.
My attempted solution used a mutable ListBuffer (to which I keep adding the futures that I spawn) and Future.sequence to schedule a function to run once all the futures listed in this buffer have completed.
I've prepared a minimal example that illustrates the issue:
import scala.collection.mutable.ListBuffer
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FuturesTest extends App {
  val queue = ListBuffer[Future[Int]]()
  val f1 = Future {
    Thread.sleep(1000)
    // f3 is spawned from inside f1, one second after startup
    val f3 = Future {
      Thread.sleep(2000)
      Console.println(s"f3: 1+2=3 sec; queue = $queue")
      3
    }
    queue += f3
    Console.println(s"f1: 1 sec; queue = $queue")
    1
  }
  val f2 = Future {
    Thread.sleep(2000)
    Console.println(s"f2: 2 sec; queue = $queue")
    2
  }
  queue += f1
  queue += f2
  Console.println(s"starting; queue = $queue")
  Future.sequence(queue).foreach(
    (all) => Console.println(s"Future.sequence finished with $all")
  )
  Thread.sleep(5000) // simulates app being alive later
}
It schedules the f1 and f2 futures first, and then f3 is scheduled from inside f1 one second later. f3 itself resolves after 2 more seconds. Thus, what I expect to get is the following:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2, 3)
However, I actually get:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2)
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
... which is most likely because the list of futures that we wait for is fixed at the initial call of Future.sequence and won't change later.
Ultimately, I've made it behave the way I wanted with this code:
// needs: import scala.util.{Failure, Success}
waitForSequence(queue, (all: ListBuffer[Int]) => Console.println(s"finished with $all"))

def waitForSequence[T](queue: ListBuffer[Future[T]], act: (ListBuffer[T] => Unit)): Unit = {
  val seq = Future.sequence(queue)
  seq.onComplete {
    case Success(res) =>
      // the queue grew while we were waiting, so wait again on the longer queue
      if (res.size < queue.size) {
        Console.println("... still waiting for tasks")
        waitForSequence(queue, act)
      } else {
        act(res)
      }
    case Failure(exc) =>
      throw exc
  }
}
This works as intended, getting all 3 futures in the end:
starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
... still waiting for tasks
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
finished with ListBuffer(1, 2, 3)
But it's still very ugly. It just restarts the Future.sequence wait if it sees that, at the time of completion, the queue is longer than the number of results, hoping that the situation will be better when it completes next time. Of course, this is bad because it exhausts the stack, and it might be error-prone if this check triggers in the tiny window between the creation of a future and appending it to the queue.
Is it possible to do this without rewriting everything with Akka or resorting to Await.result (which I can't actually use because my code is compiled for Scala.js)?
The right way to do this is probably to compose your Futures. Specifically, f1 shouldn't just kick off f3, it should probably flatMap over it; that is, the Future of f1 doesn't resolve until f3 resolves.
Keep in mind, Future.sequence is kind of a fallback option, to use only when the Futures are all really disconnected. In a case like the one you're describing, where there are real dependencies, those are best represented in the Futures you're actually returning. When using Futures, flatMap is your friend, and should be one of the first tools you reach for. (Often, but not always, as for comprehensions.)
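As a minimal sketch of what such a composition might look like for the example in the question (the names ComposeSketch, spawnChild and parentWithChild are made up for illustration, and the sleeps are left out), a for comprehension desugars to flatMap and map, so the outer future only completes once the inner one has:

```scala
import scala.concurrent.{ExecutionContext, Future}

object ComposeSketch {
  // Hypothetical stand-in for the work done by f3 in the question.
  def spawnChild()(implicit ec: ExecutionContext): Future[Int] = Future(3)

  // The returned future completes only after the child has completed,
  // so no external queue is needed to keep track of the child.
  def parentWithChild()(implicit ec: ExecutionContext): Future[List[Int]] =
    for {
      own   <- Future(1)     // the parent's own work
      child <- spawnChild()  // started once the parent's work is done
    } yield List(own, child)
}
```

Whoever holds the result of parentWithChild() can register a callback on it with foreach or onComplete, which also works on Scala.js since nothing blocks.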
It's probably safe to say that, if you ever want a mutable queue of Futures, the code isn't structured correctly and there's a better way to do it. Specifically in Scala.js (which is where much of my code lies, and which is very Future-heavy), I use for comprehensions over those Futures constantly; I think it's the only sane way to operate.
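For the directory-traversal case mentioned in the question, one possible shape is sketched below. It assumes a hypothetical in-memory Node tree in place of a real directory structure; TraverseSketch, Node and visit are illustrative names, not part of any library. Each call returns a future that already includes the results of everything it spawned, so the set of futures to wait for never has to be tracked in a mutable buffer:

```scala
import scala.concurrent.{ExecutionContext, Future}

object TraverseSketch {
  // Hypothetical tree standing in for a directory structure.
  final case class Node(value: Int, children: List[Node] = Nil)

  // The returned future completes only when the node and all of its
  // descendants have been processed.
  def visit(node: Node)(implicit ec: ExecutionContext): Future[List[Int]] =
    Future(node.value).flatMap { own =>                 // process this node
      Future
        .sequence(node.children.map(child => visit(child))) // spawn the children found here
        .map(nested => own :: nested.flatten)           // merge their results into ours
    }
}
```

Here Future.sequence is used only on siblings that really are independent of each other, while the parent-child dependency is expressed with flatMap. For the tree Node(1, List(Node(2, List(Node(3))), Node(4))) the resulting future yields List(1, 2, 3, 4).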
As Justin mentioned, you must not lose the reference to the futures spawned inside other futures; you should use map and flatMap to chain them.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

val f1 = Future {
  Thread.sleep(1000)
  val f3 = Future {
    Thread.sleep(2000)
    Console.println(s"f3: 1+2=3 sec")
    3
  }
  // return the inner future so that f1 does not complete before f3
  f3.map { r =>
    Console.println(s"f1: 1 sec;")
    Seq(1, r)
  }
}.flatMap(identity)
val f2 = Future {
  Thread.sleep(2000)
  Console.println(s"f2: 2 sec;")
  Seq(2)
}
val futures = Seq(f1, f2)
Future.sequence(futures).foreach(
  (all) => Console.println(s"Future.sequence finished with ${all.flatten}")
)
Thread.sleep(5000) // simulates app being alive later
This works on the minimal example; I am not sure whether it will work for your real use case. The result is:
f2: 2 sec;
f3: 1+2=3 sec
f1: 1 sec;
Future.sequence finished with List(1, 3, 2)
I would not involve Future.sequence: it parallelizes the operations, and you seem to be looking for sequential async execution. Also, you probably don't need the futures to start right away after being defined. The composition should look something like this:
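import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Run the tasks one after another: each task is started only after the
// previous future has completed, and its result is appended to the list.
def run[T](queue: List[() => Future[T]]): Future[List[T]] =
  queue.foldLeft(Future.successful(List.empty[T])) { (acc, next) =>
    acc.flatMap(results => next().map(result => results :+ result))
  }

def now: Long = System.currentTimeMillis()
val t0 = now

// f(n) only describes the work; nothing runs until the returned function is called.
def f(n: Int): () => Future[String] = () => {
  println(s"starting $n")
  Future[String] {
    Thread.sleep(100 * n)
    s"<<$n/${now - t0}>>"
  }
}

println(Await.result(run(f(7) :: f(10) :: f(20) :: f(3) :: Nil), 20.seconds))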
The trick is not to launch the futures prematurely; that's why we have f(n), which won't start until we call it with ().
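To make the deferral visible, here is a small sketch under the same idea. DeferredSketch, task and the started counter are hypothetical names introduced only for this illustration: wrapping the Future in a function means its body is not even submitted to the execution context until the function is applied.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future}

object DeferredSketch {
  // Counts how many task bodies have begun running.
  val started = new AtomicInteger(0)

  // Building the task does not start it; the Future is only created,
  // and therefore only scheduled, when the returned function is called.
  def task(n: Int)(implicit ec: ExecutionContext): () => Future[Int] =
    () => Future { started.incrementAndGet(); n }
}
```

Calling DeferredSketch.task(7) leaves the counter at 0; only applying the result, as in task(7)(), creates the Future and lets its body run.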