
Akka Streams recreate stream in case of stage failure

I have a very simple Akka Streams flow which reads messages from Kafka using Alpakka, performs some manipulation on each message, and indexes it to Elasticsearch.

I'm using committableSource, so I'm following an at-least-once strategy. I commit my offset only when indexing to ES succeeds; if it fails, I will read the message again from the latest known offset.

 val decider: Supervision.Decider = {
    case _:Throwable =>  Supervision.Restart
    case _           => Supervision.Restart
  }

  val config: Config = context.system.settings.config.getConfig("akka.kafka.consumer")

  val flow: Flow[CommittableMessage[String, String], Done, NotUsed] =
    Flow[CommittableMessage[String, String]]
      .map(msg => Event(msg.committableOffset, Success(Json.parse(msg.record.value()))))
      // index into Elasticsearch
      .mapAsync(10)(event => indexEvent(event.json.get).map(f => event.copy(json = f)))
      // commit the offset only if indexing succeeded, otherwise fail the stream
      .mapAsync(10) { f =>
        f.json match {
          case Success(_)  => f.committableOffset.commitScaladsl()
          case Failure(ex) => throw new StreamFailedException(ex.getMessage, ex)
        }
      }

  val r: Flow[CommittableMessage[String, String], Done, NotUsed] = RestartFlow.onFailuresWithBackoff(
    minBackoff = 3.seconds,
    maxBackoff = 3.seconds,
    randomFactor = 0.2, // adds 20% "noise" to vary the intervals slightly
    maxRestarts = 20 // limits the amount of restarts to 20
  )(() => {
    println("Creating flow")
    flow
  })

  val consumerSettings: ConsumerSettings[String, String] =
    ConsumerSettings(config, new StringDeserializer, new StringDeserializer)
      .withBootstrapServers("localhost:9092")
      .withGroupId("group1")
      .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")

  val restartSource: Source[CommittableMessage[String, String], NotUsed] = RestartSource.withBackoff(
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2, // adds 20% "noise" to vary the intervals slightly
    maxRestarts = 20 // limits the amount of restarts to 20
  ) {() =>
    Consumer.committableSource(consumerSettings, Subscriptions.topics("test"))
  }


  implicit val mat: ActorMaterializer = ActorMaterializer(ActorMaterializerSettings(context.system).withSupervisionStrategy(decider))



  restartSource
    .via(flow)
    .toMat(Sink.ignore)(Keep.both).run()

What I would like to achieve is to restart the entire stream (Source -> Flow -> Sink) if, for any reason, I was not able to index a message in Elasticsearch.

I tried the following:

  • Supervision.Decider - It looks like the flow was recreated, but no message was pulled from Kafka, apparently because the consumer remembers its offset.
  • RestartSource - Doesn't help either, because the exception happens in the flow stage.
  • RestartFlow - Doesn't help either, because it restarts only the Flow, but I need to restart the Source from the last successful offset.

Is there any elegant way to do that?

You can combine a restartable source, flow, and sink. Nothing prevents you from making each part of the graph restartable.

Update:

Code example:

val sourceFactory = () =>
  Source(1 to 10).via(Flow.fromFunction((x: Int) => { println("problematic flow"); x }))
RestartSource.withBackoff(minBackoff = 4.seconds, maxBackoff = 4.seconds, randomFactor = 0.2)(sourceFactory)
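
Applied to the pipeline from the question, the same idea could look roughly like the sketch below. It reuses the consumerSettings, flow, and topic defined in the question (so treat it as an illustration, not a drop-in implementation): the Kafka source and the indexing/commit flow are both built inside the RestartSource factory, so a failure in any stage tears the whole graph down and rebuilds it with backoff, and the new consumer resumes from the last committed offset.

val restartableGraph: Source[Done, NotUsed] = RestartSource.withBackoff(
  minBackoff = 3.seconds,
  maxBackoff = 30.seconds,
  randomFactor = 0.2,
  maxRestarts = 20
) { () =>
  // Build the whole graph inside the factory: source and flow restart together
  Consumer.committableSource(consumerSettings, Subscriptions.topics("test"))
    .via(flow) // parse, index to ES, commit offset; an exception here fails the whole graph
}

restartableGraph.runWith(Sink.ignore)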
