
Fold/reduce over List of Futures with associative & commutative operator

Consider the following:

import scala.concurrent._
import scala.concurrent.duration.Duration.Inf
import scala.concurrent.ExecutionContext.Implicits.global

def slowInt(i: Int) = { Thread.sleep(200); i }
def slowAdd(x: Int, y: Int) = { Thread.sleep(100); x + y }
def futures = (1 to 20).map(i => future(slowInt(i)))

def timeFuture(fn: => Future[_]) = {
  val t0 = System.currentTimeMillis
  Await.result(fn, Inf)
  println((System.currentTimeMillis - t0) / 1000.0 + "s")
}

Both of the following print ~2.5s:

// Use Future.reduce directly (Future.traverse is no different)
timeFuture { Future.reduce(futures)(slowAdd) }

// First wait for all results to come in, convert to Future[List], and then map the List[Int]
timeFuture { Future.sequence(futures).map(_.reduce(slowAdd)) }

As far as I can understand, the reason for this is that Future.reduce/traverse is generic and therefore does not run faster with an associative operator. However, is there an easy way to define a computation where the folding/reducing starts as soon as at least 2 values are available (or 1 in the case of fold), so that while some items in the list are still being generated, the already generated ones are already being combined?

Scalaz has an implementation of futures that includes a chooseAny combinator, which takes a collection of futures and returns a future of a tuple of the first completed element and the remaining futures:

def chooseAny[A](h: Future[A], t: Seq[Future[A]]): Future[(A, Seq[Future[A]])]

Twitter's implementation of futures calls this select. The standard library doesn't include it (but see Viktor Klang's implementation pointed out by Som Snytt above). I'll use Scalaz's version here, but the translation should be straightforward.
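For reference, a minimal sketch of such a combinator using only the standard library might look like the following (the name select and the exact signature are chosen here to mirror Scalaz's chooseAny; this is not Scalaz's or Twitter's actual code):

```scala
import scala.concurrent.{ExecutionContext, Future, Promise}

// Sketch: complete with the first finished future's value together with the
// futures that are still pending. The remaining futures keep running.
def select[A](fs: Seq[Future[A]])(implicit ec: ExecutionContext): Future[(A, Seq[Future[A]])] = {
  val p = Promise[(A, Seq[Future[A]])]()
  for ((f, i) <- fs.zipWithIndex)
    f.onComplete { t => p.tryComplete(t.map(a => (a, fs.patch(i, Nil, 1)))) }
  p.future
}
```

The first future to complete (whether with a value or a failure) decides the result; later completions are ignored by tryComplete.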

One approach to getting the operations to run as you wish is to pull two completed items off the list, push a future of their sum back onto the list, and recurse (see this gist for a complete working example):

def collapse[A](fs: Seq[Future[A]])(implicit M: Monoid[A]): Future[A] =
  Nondeterminism[Future].chooseAny(fs).fold(Future.now(M.zero))(
    _.flatMap {
      case (hv, tf) =>
        Nondeterminism[Future].chooseAny(tf).fold(Future.now(hv))(
          _.flatMap {
            case (hv2, tf2) => collapse(Future(hv |+| hv2) +: tf2)
          }
        )
    }
  )

In your case you'd call something like this:

timeFuture(
  collapse(futures)(
    Monoid.instance[Int]((a, b) => slowAdd(a, b), 0)
  )
)

This runs in just a touch over 1.6 seconds on my dual-core laptop, so it's working as expected (and will continue to do what you want even if the time taken by slowInt varies).

To get timings similar to yours, I had to use a local ExecutionContext (from here):

import java.util.concurrent.Executors

implicit val ec = ExecutionContext.fromExecutor(Executors.newCachedThreadPool())

After that, I got better performance by splitting up the list and starting the work on each sublist by assigning it to a val (recall that futures in a for-comprehension are handled in order unless they are assigned to vals before the for-comprehension). Because the operation is associative, I could then recombine the sublists with one more call to the same function. I modified the timeFuture function to take a description and print the result of the addition:

def timeFuture(desc: String, fn: => Future[_]) = {
  val t0 = System.currentTimeMillis
  val res = Await.result(fn, Inf)
  println(desc + " = " + res + " in " + (System.currentTimeMillis - t0) / 1000.0 + "s")
}

I'm new to Scala, so I'm still working out how to reuse the same function at the last step (I think it should be possible), so I cheated and created a helper function:

def futureSlowAdd(x: Int, y: Int) = future(slowAdd(x, y))

Then I could do the following:

timeFuture( "reduce", { Future.reduce(futures)(slowAdd) } )

val right = Future.reduce(futures.take(10))(slowAdd)
val left = Future.reduce(futures.takeRight(10))(slowAdd)
timeFuture( "split futures", (right zip left) flatMap (futureSlowAdd _).tupled)

With that last zip etc. from here.

I think this parallelizes the work and recombines the results. When I run those, I get:

reduce = 210 in 2.111s
split futures = 210 in 1.201s

I've used a hard-coded pair of takes, but my idea is that the whole list splitting could be put into a function that reuses the associative function handed to both the right and left branches (with slightly unbalanced trees allowed due to remainders).
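A minimal sketch of that idea, with a hypothetical splitReduce helper (a name introduced here for illustration), could look like this: recursively split the list of already started futures in half and recombine each half with the associative function.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Sketch: divide-and-conquer reduction over futures that are already running.
// Splitting only decides the shape of the combining tree; the futures
// themselves were started when the collection was built.
def splitReduce[A](fs: Seq[Future[A]])(op: (A, A) => A)(implicit ec: ExecutionContext): Future[A] =
  if (fs.size == 1) fs.head
  else {
    val (left, right) = fs.splitAt(fs.size / 2)
    (splitReduce(left)(op) zip splitReduce(right)(op)).map { case (a, b) => op(a, b) }
  }
```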


When I randomize the slowInt() and slowAdd() functions like:

def rand(): Int = Random.nextInt(3)+1
def slowInt(i: Int) = { Thread.sleep(rand()*100); i }
def slowAdd(x: Int, y: Int) = { Thread.sleep(rand()*100); x + y }

I still see "split futures" completing sooner than "reduce". There appears to be some startup overhead that affects the first timeFuture call. Here are a few example runs where "split futures" pays the startup penalty:

split futures = 210 in 2.299s
reduce = 210 in 4.7s

split futures = 210 in 2.594s
reduce = 210 in 3.5s

split futures = 210 in 2.399s
reduce = 210 in 4.401s

On a faster computer than my laptop, using the same ExecutionContext as in the question, I don't see such large differences (without the randomization in the slow* functions):

split futures = 210 in 2.196s
reduce = 210 in 2.5s

Here it looks like "split futures" only leads by a little.


One last go. Here's a function (aka abomination) that extends the idea I had above:

def splitList[A <: Any]( f: List[Future[A]], assocFn: (A, A) => A): Future[A] = {
    def applyAssocFn( x: Future[A], y: Future[A]): Future[A] = {
      (x zip y) flatMap( { case (a,b) => future(assocFn(a, b)) } )
    }
    def divideAndConquer( right: List[Future[A]], left: List[Future[A]]): Future[A] = {
      (right, left) match {
        case(r::Nil, Nil) => r
        case(Nil, l::Nil) => l
        case(r::Nil, l::Nil) => applyAssocFn( r, l )
        case(r::Nil, l::ls) => {
          val (l_right, l_left) = ls.splitAt(ls.size/2)
          val lret = applyAssocFn( l, divideAndConquer( l_right, l_left ) )
          applyAssocFn( r, lret )
        }
        case(r::rs, l::Nil) => {
          val (r_right, r_left) = rs.splitAt(rs.size/2)
          val rret = applyAssocFn( r, divideAndConquer( r_right, r_left ) )
          applyAssocFn( rret, l )
        }
        case (r::rs, l::ls) => {
          val (r_right, r_left) = rs.splitAt(rs.size/2)
          val (l_right, l_left) = ls.splitAt(ls.size/2)
          val tails = applyAssocFn(divideAndConquer( r_right, r_left ), divideAndConquer( l_right, l_left ))
          val heads = applyAssocFn(r, l)
          applyAssocFn( heads, tails )
        }
      }
    }
    val( right, left ) = f.splitAt(f.size/2)
    divideAndConquer( right, left )
}

It takes all the pretty out of Scala to split the list up non-tail-recursively and assign the futures to vals as soon as possible (to start them).

When I test it like:

timeFuture( "splitList", splitList( futures.toList, slowAdd) )

I get the following timings on my laptop using the newCachedThreadPool():

splitList = 210 in 0.805s
split futures = 210 in 1.202s
reduce = 210 in 2.105s

I noticed that the "split futures" timings could be invalid because those futures are started outside of the timeFuture block. However, the splitList function is called correctly inside the timeFuture call. One take-away for me is the importance of picking an ExecutionContext that's best suited to the hardware.

The answer below will run in 700ms on a 20-core machine, which, given what must run sequentially, is as fast as possible on any machine with any implementation (20 parallel 200ms slowInt calls followed by 5 nested 100ms slowAdd calls). It runs in 1600ms on my 4-core machine, which is likewise the best that machine can do.

When the slowAdd calls are expanded, with f representing slowAdd:

f(f(f(f(f(x1, x2), f(x3, x4)), f(f(x5, x6), f(x7, x8))), f(f(f(x9, x10), f(x11, x12)), f(f(x13, x14), f(x15, x16)))), f(f(x17, x18), f(x19, x20)))
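The 5 nested slowAdd calls come from the depth of that balanced tree: each level of combining halves (rounding up) the number of pending values. The arithmetic can be sketched as:

```scala
// Levels of pairwise combining needed to collapse n values down to one:
// 20 -> 10 -> 5 -> 3 -> 2 -> 1, i.e. five levels, so the critical path is
// 200ms (all slowInts in parallel) + 5 * 100ms = 700ms.
def treeDepth(n: Int): Int = if (n <= 1) 0 else 1 + treeDepth((n + 1) / 2)
```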

The example you provided that uses Future.sequence will run in 2100ms on a 20-core machine (20 parallel 200ms slowInt calls followed by 19 nested 100ms slowAdd calls). It runs in 2900ms on my 4-core machine.

When the slowAdd calls are expanded, with f representing slowAdd:

f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(f(x1, x2), x3), x4), x5), x6), x7), x8), x9), x10), x11), x12), x13), x14), x15), x16), x17), x18), x19), x20)

The Future.reduce method calls Future.sequence(futures).map(_ reduceLeft op), so the two examples you provided are equivalent.

My answer uses a combine function that takes as parameters a list of futures and op, a function that combines two futures into one. The function returns op applied to all pairs of futures, then to pairs of pairs, and so on until a single future remains, which is returned:

def combine[T](list: List[Future[T]], op: (Future[T], Future[T]) => Future[T]): Future[T] =
  if (list.size == 1) list.head
  else if (list.size == 2) list.reduce(op)
  else combine(list.grouped(2).map(combine(_, op)).toList, op) // recurse so the combining tree stays balanced

Note: I modified your code a bit to match my style preferences.

def slowInt(i: Int): Future[Int] = Future { Thread.sleep(200); i }
def slowAdd(fx: Future[Int], fy: Future[Int]): Future[Int] = fx.flatMap(x => fy.map { y => Thread.sleep(100); x + y })
val futures: List[Future[Int]] = List.range(1, 21).map(slowInt)

The code below uses the combine function for your case:

timeFuture(combine(futures, slowAdd))

The code below updates your Future.sequence example to match my modifications:

timeFuture(Future.sequence(futures).map(_.reduce{(x, y) => Thread.sleep(100); x + y }))
