
Scala: joining / waiting for growing queue of futures

I launch several async processes which, in turn, can launch more processes if needed (think of traversing a directory structure or something like that). Each process returns something, and in the end I want to wait for all of them to complete and schedule a function that will do something with the resulting collection.

Naïve attempt

My attempt used a mutable ListBuffer (to which I keep adding the futures I spawn) and Future.sequence to schedule a function to run once all the futures in this buffer have completed.

I've prepared a minimal example that illustrates the issue:

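import scala.collection.mutable.ListBuffer
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object FuturesTest extends App {
  // The buffer itself is mutated, so the reference can be a val
  val queue = ListBuffer[Future[Int]]()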

  val f1 = Future {
    Thread.sleep(1000)
    val f3 = Future {
      Thread.sleep(2000)
      Console.println(s"f3: 1+2=3 sec; queue = $queue")
      3
    }
    queue += f3
    Console.println(s"f1: 1 sec; queue = $queue")
    1
  }
  val f2 = Future {
    Thread.sleep(2000)
    Console.println(s"f2: 2 sec; queue = $queue")
    2
  }

  queue += f1
  queue += f2
  Console.println(s"starting; queue = $queue")

  Future.sequence(queue).foreach(
    (all) => Console.println(s"Future.sequence finished with $all")
  )

  Thread.sleep(5000) // simulates app being alive later
}

It schedules the f1 and f2 futures first; f3 is then scheduled from inside f1 one second later. f3 itself resolves after 2 more seconds. Thus, what I expect to get is the following:

starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2, 3)

However, I actually get:

starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
Future.sequence finished with ListBuffer(1, 2)
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))

... which is most likely because the list of futures we wait for is fixed at the moment Future.sequence is called and doesn't change afterwards.
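A small sketch that seems to confirm this. It is not part of the original program: p1, p2 and buffer are illustrative names, and Await is used only to observe the result on the JVM (it is not available in Scala.js).

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Promises let us complete the futures by hand, so no sleeps are needed.
val p1 = Promise[Int]()
val p2 = Promise[Int]()
val buffer = ListBuffer[Future[Int]](p1.future)

// Future.sequence iterates over the buffer at this point and sees one future.
val combined = Future.sequence(buffer)

// Appending afterwards changes the buffer, but not what `combined` waits for.
buffer += p2.future

// p2 is never completed, yet `combined` still finishes once p1 does.
p1.success(1)

// JVM-only observation of the result.
val snapshot = Await.result(combined, 1.second)
println(s"combined = $snapshot, buffer size = ${buffer.size}")
// prints "combined = ListBuffer(1), buffer size = 2"
```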

Working, but ugly attempt

Ultimately, I made it behave the way I wanted with this code:

  waitForSequence(queue, (all: ListBuffer[Int]) => Console.println(s"finished with $all"))

  def waitForSequence[T](queue: ListBuffer[Future[T]], act: (ListBuffer[T] => Unit)): Unit = {
    val seq = Future.sequence(queue)
    seq.onComplete {
      case Success(res) =>
        if (res.size < queue.size) {
          Console.println("... still waiting for tasks")
          waitForSequence(queue, act)
        } else {
          act(res)
        }
      case Failure(exc) =>
        throw exc
    }
  }

This works as intended, getting all 3 futures in the end:

starting; queue = ListBuffer(Future(<not completed>), Future(<not completed>))
f1: 1 sec; queue = ListBuffer(Future(<not completed>), Future(<not completed>), Future(<not completed>))
f2: 2 sec; queue = ListBuffer(Future(Success(1)), Future(<not completed>), Future(<not completed>))
... still waiting for tasks
f3: 1+2=3 sec; queue = ListBuffer(Future(Success(1)), Future(Success(2)), Future(<not completed>))
finished with ListBuffer(1, 2, 3)

But it's still very ugly. It just restarts the Future.sequence wait if it sees that, at the time of completion, the queue is longer than the number of results, hoping that the situation will be better the next time it completes. Of course, this is bad because it exhausts the stack, and it is error-prone if the check fires in the tiny window between the creation of a future and appending it to the queue.


Is it possible to do this without rewriting everything with Akka or resorting to Await.result (which I can't actually use, because my code is compiled for Scala.js)?

The right way to do this is probably to compose your Futures. Specifically, f1 shouldn't just kick off f3; it should probably flatMap over it, so that the Future of f1 doesn't resolve until f3 resolves.

Keep in mind that Future.sequence is kind of a fallback option, to use only when the Futures are all really disconnected. In a case like the one you're describing, where there are real dependencies, those are best represented in the Futures you're actually returning. When using Futures, flatMap is your friend and should be one of the first tools you reach for (often, but not always, in the form of for comprehensions).

It's probably safe to say that if you ever want a mutable queue of Futures, the code isn't structured correctly and there's a better way to do it. Specifically in Scala.js (which is where much of my code lies, and which is very Future-heavy), I use for comprehensions over those Futures constantly. I think it's the only sane way to operate...
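To make this concrete for the directory-traversal scenario from the question, here is a minimal sketch. Everything in it is hypothetical (the Traversal object, the Dir tree, and the visit and walk functions are illustrative names, not taken from the question); it only shows the shape of the composition, where the Future for a node completes after all of its descendants.

```scala
import scala.concurrent.{ExecutionContext, Future}

object Traversal {
  // Hypothetical stand-in for a directory tree.
  final case class Dir(name: String, children: List[Dir] = Nil)

  // Hypothetical async step doing the per-node work.
  def visit(dir: Dir)(implicit ec: ExecutionContext): Future[String] =
    Future(dir.name.toUpperCase)

  // The returned future completes only after the futures of all
  // descendants have completed, so no shared mutable queue is needed.
  def walk(dir: Dir)(implicit ec: ExecutionContext): Future[List[String]] =
    for {
      own      <- visit(dir)
      // Future.traverse creates all child walks up front, then combines them.
      children <- Future.traverse(dir.children)(child => walk(child))
    } yield own :: children.flatten
}

import scala.concurrent.ExecutionContext.Implicits.global

val tree = Traversal.Dir("root", List(
  Traversal.Dir("a", List(Traversal.Dir("a1"))),
  Traversal.Dir("b")
))

// No Await involved, so the same shape should also work in Scala.js.
Traversal.walk(tree).foreach(names => println(s"finished with $names"))
```

The result order follows the tree (parent first, then children in list order) regardless of which child finishes first, because Future.traverse keeps the input order.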

As Justin mentioned, you must not lose the reference to futures spawned inside other futures; use map and flatMap to chain them.

val f1 = Future {
  Thread.sleep(1000)
  val f3 = Future {
    Thread.sleep(2000)
    Console.println(s"f3: 1+2=3 sec")
    3
  }
  f3.map{
    r =>
      Console.println(s"f1: 1 sec;")
      Seq(1, r)
  }
}.flatMap(identity)

val f2 = Future {
  Thread.sleep(2000)
  Console.println(s"f2: 2 sec;")
  Seq(2)
}

val futures = Seq(f1, f2)

Future.sequence(futures).foreach(
  (all) => Console.println(s"Future.sequence finished with ${all.flatten}")
)

Thread.sleep(5000) // simulates app being alive later

This works on the minimal example; I am not sure if it will work for your real use case. The result is:

f2: 2 sec;
f3: 1+2=3 sec
f1: 1 sec;
Future.sequence finished with List(1, 3, 2)

I would not involve Future.sequence: it parallelizes the operations, and you seem to be looking for sequential async execution. Also, you probably don't need the futures to start right away after they are defined. The composition should look something like this:

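import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Runs the thunks one at a time: each future is created only after the
// previous one has completed, and results are collected in queue order.
// Assumption: the fold body is filled in from the stated intent (sequential execution).
def run[T](queue: List[() => Future[T]]): Future[List[T]] =
  queue.foldLeft(Future.successful(List.empty[T])) { (acc, next) =>
    acc.flatMap(results => next().map(result => results :+ result))
  }

// Assumption: `now` is the wall clock in milliseconds.
def now: Long = System.currentTimeMillis()
val t0 = now

def f(n: Int): () => Future[String] = () => {
  println(s"starting $n")
  Future[String] {
    Thread.sleep(100 * n)
    s"<<$n/${now - t0}>>"
  }
}

println(Await.result(run(f(7) :: f(10) :: f(20) :: f(3) :: Nil), 20.seconds))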

The trick is not to launch the futures prematurely; that's why f(n) returns a function, which won't start the future until we call it with ().
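A short sketch of the difference between an eager Future and a deferred one. The started counter and the task function are hypothetical helpers added only to make the moment of creation observable.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// Counts how many futures have been created so far.
val started = new AtomicInteger(0)

// Hypothetical task: bumps the counter at the moment the future is created.
def task(n: Int): Future[Int] = {
  started.incrementAndGet()
  Future(n)
}

// Eager: both futures are created (and scheduled) while the list is built.
val eager: List[Future[Int]] = List(task(1), task(2))
val afterEager = started.get() // 2

// Deferred: only functions are stored; nothing has been created yet.
val deferred: List[() => Future[Int]] = List(() => task(3), () => task(4))
val afterDeferred = started.get() // still 2

// Calling a thunk with () is what finally creates its future.
val first: Future[Int] = deferred.head()
val afterCall = started.get() // 3
```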

