How to calculate ratio via akka-stream

I'm looking for a way to calculate the ratio (percentage) of each element in a stream relative to the sum of all elements.

(10, 20, 30, 40, 50) -> (10/150, 20/150, 30/150, 40/150, 50/150), where 150 is the sum of the elements in the stream.

The graph should reduce the stream to one element and then apply that element to each element of the stream.

I was thinking of using broadcast(2) on the stream: branch (1) would do a reduce (calculate the sum), branch (2) would pass the elements through unchanged, and then I would zip the two somehow. The problem is that zip combines elements 1:1.

Since you say the data is finite (implication: the upstream source will complete), something like this (in Scala) will work.

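import akka.stream.scaladsl.Source

// The materialized value of `source` is passed through, so the result is Source[Double, Any]
def normalizeToTotal(source: Source[Int, Any]): Source[Double, Any] =
  source.map(i => Option(i))  // map everything to Some...
    .concat(Source.single(None)) // so we can use None to signal upstream completion
    .statefulMapConcat { () =>
      var elems: List[Int] = Nil  // elements seen so far, newest first
      // no brace-led block here: a `{` on the line after `Nil` would parse as `Nil { ... }`
      (elem: Option[Int]) => {
        elem.foreach { e => elems = e :: elems }  // only when not yet completed
        if (elem.isEmpty) {
          // upstream is completing (None is the last element)
          val des = elems.map(_.toDouble)
          val sum = des.sum  // note: a sum of 0 yields NaN or Infinity ratios
          val toEmit = des.reverse.map(_ / sum)  // restore arrival order
          elems = Nil  // preserve our invariant even in death...
          toEmit
        } else {
          // not yet completed, don't emit
          Nil
        }
      }
    }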

Disclaimer: the compiler in my mind passes this.

It should be noted that this will consume memory proportional to the number of elements in the stream (because nothing can be emitted until all elements are known): this is not a streaming algorithm, but a batch algorithm implemented on top of a streaming API.
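If the source happens to be replayable (it can be materialized twice and produces the same elements each time), the buffering can be traded for a second pass. The following is only a sketch under that assumption, and it also assumes Akka 2.6 or later so that an implicit ActorSystem supplies the materializer. The names normalizeTwoPass and runDemo are hypothetical and not part of the answer above.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

// Hypothetical helper: assumes `source` can be run twice with identical elements.
def normalizeTwoPass(source: Source[Int, Any])(implicit system: ActorSystem): Source[Double, Any] = {
  import system.dispatcher // ExecutionContext for Future.map
  // Pass 1: fold the whole stream down to its sum (starts running when this method is called).
  val total = source.runFold(0.0)(_ + _)
  // Pass 2: once the sum is known, replay the source and divide each element by it.
  // As with the buffering version, a sum of 0 yields NaN or Infinity ratios.
  Source.futureSource(total.map(sum => source.map(_ / sum)))
}

// Hypothetical driver: runs the two-pass version on an in-memory list and collects the result.
def runDemo(input: List[Int]): Seq[Double] = {
  implicit val system: ActorSystem = ActorSystem("demo")
  try Await.result(normalizeTwoPass(Source(input)).runWith(Sink.seq), 5.seconds)
  finally system.terminate()
}
```

This keeps memory constant but reads the data twice, so it only makes sense when re-reading the upstream is cheap and deterministic (for example a file or an in-memory collection).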

(then again, if a stream can be viewed as a stream of small batches (I see you, Spark...), batch processing can equally be viewed as a stream that's most often "dry")

It can also be noted that the statefulMapConcat stage (as long as it maintains its invariant) will work with an infinite stream of Option[Int]s, interpreting None as an emit-at-end-of-batch indicator. If you modify it to consume such a stream, it may still be useful to concat(Source.single(None)) onto its input so that the final batch is always terminated.
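To make the batch-delimiter behaviour concrete, here is a small model of the same stateful function using only plain Scala iterators (no Akka involved). It is an illustrative sketch: normalizeBatches is a hypothetical name, and the body mirrors the logic of the stage above rather than being the stage itself.

```scala
// Plain-Scala model of the stateful stage: Some(e) is buffered, None closes the current batch.
def normalizeBatches(input: Iterator[Option[Int]]): Iterator[Double] = {
  var elems: List[Int] = Nil // current batch, newest first
  input.flatMap {
    case Some(e) =>
      elems = e :: elems // buffer the element, emit nothing yet
      Nil
    case None =>
      val des = elems.reverse.map(_.toDouble) // restore arrival order
      val sum = des.sum
      elems = Nil // reset so the next batch starts empty
      des.map(_ / sum)
  }
}

// Two batches separated by None; each batch is normalized against its own sum.
val ratios = normalizeBatches(
  Iterator(Some(1), Some(3), None, Some(2), Some(2), None)
).toList
// ratios == List(0.25, 0.75, 0.5, 0.5)
```

Note that elements after the last None would never be emitted, which is why appending a final None to the input matters.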
