
How to calculate ratio via akka-stream

I'm looking for a way to calculate the ratio (percentage) of each element in a stream relative to the sum of all elements.

(10, 20, 30, 40, 50) -> (10/150, 20/150, 30/150, 40/150, 50/150), where 150 is the sum of all elements in the stream.

The graph should reduce the stream to a single element (the sum) and then apply that element to each element of the stream.

I was thinking about using broadcast(2) on the stream: branch (1) would do a reduce (calculate the sum), branch (2) would pass the elements through unchanged, and then the two would somehow be zipped together. The problem is that zip combines elements 1:1.

Since you say the data is finite (implication: the upstream source will complete), something like this (in Scala) will work.

import akka.stream.scaladsl.Source

// Mat is generic so the result keeps the materialized value type of the input source
def normalizeToTotal[Mat](source: Source[Int, Mat]): Source[Double, Mat] =
  source.map(i => Option(i))  // map everything to Some...
    .concat(Source.single(None)) // so we can use None to signal upstream completion
    .statefulMapConcat { () =>
      var elems: List[Int] = Nil  // elements seen so far, newest first

      (elem: Option[Int]) => {
        elem.foreach { e => elems = e :: elems }  // only when not yet completed
        if (elem.isEmpty) {
          // upstream is completing (None is the last element)
          val des = elems.map(_.toDouble)
          val sum = des.sum
          val toEmit = des.reverse.map(_ / sum)  // restore arrival order, then divide
          elems = Nil  // preserve our invariant even in death...
          toEmit
        } else {
          // not yet completed, don't emit
          Nil
        }
      }
    }

Disclaimer: the compiler in my mind passes this.

Note that this consumes memory proportional to the number of elements in the stream, because nothing can be emitted until all elements are known. This is not a streaming algorithm, but a batch algorithm implemented against a streaming API.

(Then again, if a stream can be viewed as a stream of small batches (I see you, Spark...), batch processing can equally be viewed as a stream that's most often "dry".)
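If the memory cost noted above is a concern and the source can be replayed (for example, it is backed by a collection or a file), one alternative is to traverse it twice: once to compute the total, and once to divide. The following is only a sketch under that assumption, not part of the answer's approach. The names `TwoPassRatio` and `normalizeTwoPass` are hypothetical, and the implicit `ActorSystem` acting as materializer assumes Akka 2.6 or later.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object TwoPassRatio {
  // Hypothetical helper: materializes `source` twice, so it is only
  // appropriate when the source yields the same elements on every run.
  def normalizeTwoPass[Mat](source: Source[Int, Mat]): Source[Double, Mat] =
    source
      .fold(0L)(_ + _)           // first pass: emits exactly one element, the total
      .flatMapConcat { total =>  // second pass: replay the source, divide by the total
        source.map(_.toDouble / total)
      }

  def main(args: Array[String]): Unit = {
    // Assumption: Akka 2.6+, where an implicit ActorSystem provides the materializer
    implicit val system: ActorSystem = ActorSystem("ratio")
    try {
      val ratios = Await.result(
        normalizeTwoPass(Source(List(10, 20, 30, 40, 50))).runWith(Sink.seq),
        5.seconds
      )
      println(ratios) // five ratios that add up to (approximately) 1.0
    } finally system.terminate()
  }
}
```

This trades buffering for a second traversal of the source. It is not suitable for sources that cannot be re-run, such as a live network feed. As with the buffering version, a total of zero yields NaN or infinite ratios rather than an error.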

It can also be noted that the statefulMapConcat stage (as long as it maintains its invariant) will work with an infinite stream of Option[Int]s, interpreting None as an end-of-batch marker that triggers emission. If you modify it to consume such a stream, it may still be useful to concat(Source.single(None)) on its input to ensure the final batch is terminated, of course.
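To illustrate the batch-delimited reading, here is a minimal sketch that packages the same stage as a standalone Flow and feeds it two batches separated by None. The names `BatchRatio` and `normalizeBatches` are hypothetical, and the implicit `ActorSystem` acting as materializer assumes Akka 2.6 or later.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}

import scala.concurrent.Await
import scala.concurrent.duration._

object BatchRatio {
  // Hypothetical name. Some(x) adds x to the current batch;
  // None closes the batch and emits each element divided by the batch sum.
  val normalizeBatches: Flow[Option[Int], Double, NotUsed] =
    Flow[Option[Int]].statefulMapConcat[Double] { () =>
      var elems: List[Int] = Nil // current batch, newest element first

      (elem: Option[Int]) => elem match {
        case Some(e) =>
          elems = e :: elems
          Nil // batch still open: emit nothing
        case None =>
          val batch = elems.reverse.map(_.toDouble)
          val sum = batch.sum
          elems = Nil // reset state for the next batch
          batch.map(_ / sum)
      }
    }

  def main(args: Array[String]): Unit = {
    // Assumption: Akka 2.6+, where an implicit ActorSystem provides the materializer
    implicit val system: ActorSystem = ActorSystem("batches")
    try {
      val input = List[Option[Int]](Some(10), Some(30), None, Some(1), Some(1), None)
      val out = Await.result(
        Source(input).via(normalizeBatches).runWith(Sink.seq),
        5.seconds
      )
      println(out) // Vector(0.25, 0.75, 0.5, 0.5)
    } finally system.terminate()
  }
}
```

Elements that arrive after the last None are never emitted, which is why appending a final None to a finite input remains useful.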
