
Scala idiomatic way to time out an iterator map from within?

I have a producer and consumer architecture where my producer returns an iterator and my consumer expects transformed results. Both are outside my control. My code is responsible for transforming the source stream. One problem is that the source's throughput is unreliable: it produces records at varying rates, and sometimes too slowly.

Is it possible to terminate the stream from within a map stage? I do have a flag that I can set to kill the process. By the way, I cannot place Futures and a timeout outside the consumer.

Things I tried:

Calling kill within a map. The drawback is that the check only runs when a record arrives, so if no record is generated for a while the condition is never triggered.

source.map(x => { if (System.currentTimeMillis() > limit) kill(); x })

Another option is to use a while loop. But a while loop cannot yield values.

while (source.hasNext) {
    // Wait at most `limit` for the next element; a timeout surfaces as a Failure
    Try(Await.result(Future { source.next() }, limit)) match {
        case Failure(e) => kill()
        case bla..
    }
}

Any innovative ideas for this?

It's a little hard to grasp the situation without more details on the types you're dealing with.

I wonder if you can't just wrap the source Iterator with your own transformer Iterator.

class Transformer[A,B](src :A) extends Iterator[B] {
  private var nextB :B = _
  def hasNext :Boolean = {
    // pull next element from src
    // if successful load nextB and return true else return false
  }

  def next() :B = nextB
}

Then you can simply let toStream create a Stream[B] that will, at some point, terminate.

sendToConsumer((new Transformer(source)).toStream)
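To make the skeleton above concrete, here is a minimal sketch of one way to fill it in. It assumes the source is an Iterator[A] and that giving up after a per-element timeout is acceptable; TimeoutTransformer, its timeout parameter and the function f are names made up for this illustration, not part of the original code.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical wrapper: applies `f` to each source element and ends the
// stream once the source fails to deliver an element within `timeout`.
class TimeoutTransformer[A, B](src: Iterator[A], timeout: FiniteDuration)(f: A => B)
    extends Iterator[B] {

  private var buffered: Option[B] = None // element pulled but not yet handed out
  private var finished = false           // set once the source timed out, failed or ran dry

  def hasNext: Boolean = {
    if (buffered.isEmpty && !finished) {
      // Pull from the source on another thread and wait at most `timeout`.
      val pulled = Try(Await.result(
        Future { if (src.hasNext) Some(src.next()) else None }, timeout))
      pulled.toOption.flatten match {
        case Some(a) => buffered = Some(f(a))
        case None    => finished = true
      }
    }
    buffered.isDefined
  }

  def next(): B = {
    if (!hasNext) throw new NoSuchElementException("next on empty iterator")
    val b = buffered.get
    buffered = None
    b
  }
}
```

Note that after a timeout the abandoned Future may still be blocked inside the source, so the source should not be touched again from this thread. This is also the place where a kill flag could be set, in the branch that sets finished.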

Okay, I am going to piggyback on jwvh's answer to add the details of the iterator. I am using Try to prefetch the result of next, so we don't have to time Futures twice: once for hasNext and once for next.

import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Try}

case class TimedIterator[A](src : Iterator[A], timeout: Duration)
  extends Iterator[Try[A]] {
    private val fail = Failure(new TimeoutException("Iterator timed out after %s".format(timeout.toString)))
    private def fetchNext(): Try[A] = Try(Await.result(Future{src.next()}, timeout))

    private val limitTime = System.currentTimeMillis() + timeout.toMillis
    private var _next: Try[A] = fetchNext()

    def hasNext :Boolean = _next.isSuccess
    def next() : Try[A] = {
      val res = if (System.currentTimeMillis() > limitTime) fail else _next
      _next   = if (res.isSuccess) fetchNext() else res
      res
    }
}
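Since the consumer expects plain transformed values rather than Try values, an iterator of Try[A] like the one above would probably need unwrapping before it is handed over. A small sketch of how that might look follows; untilFailure and its onFailure callback are hypothetical names, and the callback is where something like the kill flag from the question could be set.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical helper: passes successful elements through, calls `onFailure`
// on the first Failure it sees, and ends the stream at that point.
def untilFailure[A](it: Iterator[Try[A]])(onFailure: Throwable => Unit): Iterator[A] =
  it.takeWhile {
    case Success(_) => true
    case Failure(e) => onFailure(e); false
  }.collect { case Success(a) => a }
```

Because takeWhile and collect on an Iterator are lazy, nothing is pulled from the source until the consumer asks for an element, and the callback runs only when a Failure is actually encountered.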
