
Processing a sequence with duplicates concurrently

Suppose I have a function fab: A => B and a sequence of A, and I need to get a sequence of pairs (A, B) like this:

def foo(fab: A => B, as: Seq[A]): Seq[(A, B)] = as.zip(as.map(fab))

Now I want to run fab concurrently using scala.concurrent.Future, but I want to run fab only once for each distinct element of as. For instance,

val fab: A => B = ...
val a1: A = ...
val a2: A = ...
val as = a1 :: a1 :: a2 :: a1 :: a2 :: Nil
foo(fab, as) // invokes fab twice and runs these invocations concurrently

How would you implement it?

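import scala.concurrent.{ExecutionContext, Future}

// Runs f once per distinct element (each call in its own Future),
// then looks up every original element to rebuild the full sequence.
def foo[A, B](as: Seq[A])(f: A => B)(implicit exc: ExecutionContext)
: Future[Seq[(A, B)]] = {
  Future
    .traverse(as.toSet)(a => Future((a, (a, f(a)))))
    .map(abs => as map abs.toMap)
}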

Explanation:

  1. as.toSet ensures that f is invoked only once for each distinct a.
  2. Mapping each a to (a, (a, f(a))) gives you a set of nested tuples of shape (a, (a, b)).
  3. Converting that set to a Map from a to (a, b) and mapping the original sequence through it gives you a sequence of (a, b) pairs.
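
The steps above can be checked with a small, self-contained sketch. It is not part of the original answer: the names DedupDemo, slowLength and calls are hypothetical, and the body is a slightly simplified variant that stores plain (a, b) pairs instead of nested tuples. It counts how often f runs for an input with duplicates and blocks with Await only for demonstration purposes.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object DedupDemo {
  // Simplified variant of foo: one Future per distinct element,
  // then a lookup per original element to restore order and duplicates.
  def foo[A, B](as: Seq[A])(f: A => B)(implicit ec: ExecutionContext): Future[Seq[(A, B)]] = {
    val distinctAs: Set[A] = as.toSet
    Future.traverse(distinctAs)(a => Future(a -> f(a))).map { pairs =>
      val results: Map[A, B] = pairs.toMap
      as.map(a => a -> results(a))
    }
  }

  def main(args: Array[String]): Unit = {
    import ExecutionContext.Implicits.global

    // Hypothetical stand-in for fab that records how many times it is called.
    val calls = new AtomicInteger(0)
    val slowLength: String => Int = s => { calls.incrementAndGet(); s.length }

    val input  = Seq("x", "x", "yy", "x", "yy")
    val result = Await.result(foo(input)(slowLength), 5.seconds)

    println(result)    // pairs in the original order, duplicates included
    println(calls.get) // number of times slowLength actually ran
  }
}
```

With two distinct elements in the input, the counter should end at 2 even though the result has five pairs. Note that deduplication relies on equals and hashCode of A, since it goes through a Set and a Map.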

Since your f is not asynchronous anyway, and since you don't mind using futures, you might consider using parallel collections (.par) as well:

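// On Scala 2.13+, .par needs the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
def foo2[A, B](as: Seq[A])(f: A => B): Seq[(A, B)] = {
  as map as.toSet.par.map((a: A) => a -> (a, f(a))).seq.toMap
}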

