Akka Streams recreate stream in case of stage failure
I have a very simple Akka Streams flow that reads messages from Kafka using Alpakka, performs some manipulation on each message, and indexes it into Elasticsearch.
I'm using Consumer.committableSource, so I have at-least-once semantics. I commit the offset only when indexing into ES succeeds; if it fails, I will read the message again, because the consumer resumes from the latest known offset.
val decider: Supervision.Decider = {
  // Restart the failing stage on any exception
  case _ => Supervision.Restart
}
val config: Config = context.system.settings.config.getConfig("akka.kafka.consumer")
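val flow: Flow[CommittableMessage[String, String], Done, NotUsed] =
  Flow[CommittableMessage[String, String]]
    // Parse the record value and keep its offset for the later commit
    .map(msg => Event(msg.committableOffset, Success(Json.parse(msg.record.value()))))
    // Index into Elasticsearch; the result (success or failure) replaces the json field
    .mapAsync(10) { event => indexEvent(event.json.get).map(f => event.copy(json = f)) }
    // Commit the offset only if indexing succeeded, otherwise fail the stage
    .mapAsync(10) { f =>
      f.json match {
        case Success(_)  => f.committableOffset.commitScaladsl()
        case Failure(ex) => throw new StreamFailedException(ex.getMessage, ex)
      }
    }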
val r: Flow[CommittableMessage[String, String], Done, NotUsed] =
  RestartFlow.onFailuresWithBackoff(
    minBackoff = 3.seconds,
    maxBackoff = 3.seconds,
    randomFactor = 0.2, // adds 20% "noise" to vary the intervals slightly
    maxRestarts = 20    // limits the amount of restarts to 20
  )(() => {
    println("Creating flow")
    flow
  })
val consumerSettings: ConsumerSettings[String, String] =
  ConsumerSettings(config, new StringDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("group1")
    .withProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest")
val restartSource: Source[CommittableMessage[String, String], NotUsed] =
  RestartSource.withBackoff(
    minBackoff = 3.seconds,
    maxBackoff = 30.seconds,
    randomFactor = 0.2, // adds 20% "noise" to vary the intervals slightly
    maxRestarts = 20    // limits the amount of restarts to 20
  ) { () =>
    Consumer.committableSource(consumerSettings, Subscriptions.topics("test"))
  }
implicit val mat: ActorMaterializer =
  ActorMaterializer(ActorMaterializerSettings(context.system).withSupervisionStrategy(decider))

restartSource
  .via(flow)
  .toMat(Sink.ignore)(Keep.both)
  .run()
What I would like to achieve is to restart the entire stream (Source -> Flow -> Sink) if, for any reason, I am unable to index a message in Elasticsearch.
I tried the following:
- Supervision.Decider: It looks like the flow was recreated, but no message was pulled from Kafka, apparently because the consumer remembers its offset.
- RestartSource: Doesn't help either, because the exception happens in the flow stage.
- RestartFlow: Doesn't help either, because it restarts only the Flow, but I need to restart the Source from the last successful offset.
Is there any elegant way to do that?
You can combine a restartable source, flow and sink. Nothing prevents you from making each part of the graph (source, flow or sink) restartable.
Update: code example
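// The source and the problematic flow are built inside one factory,
// so a failure in the flow recreates both of them together.
val sourceFactory = () =>
  Source(1 to 10).via(Flow.fromFunction((x: Int) => { println("problematic flow"); x }))

RestartSource.withBackoff(4.seconds, 4.seconds, 0.2)(sourceFactory)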
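
To make the effect of putting the source and the failing stage into the same factory more concrete, here is a minimal sketch that needs only akka-stream, not Kafka or Elasticsearch. It assumes the Akka 2.5-style API used in the question (ActorMaterializer, RestartSource.onFailuresWithBackoff). The names `committed`, `failedOnce` and `index` are hypothetical stand-ins: `committed` plays the role of the last committed Kafka offset, and `index` plays the role of an indexing call that fails once.

```scala
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{RestartSource, Sink, Source}
import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger}
import scala.concurrent.Await
import scala.concurrent.duration._

object RestartWholeStreamSketch {
  // Hypothetical stand-in for the committed offset: the last element fully processed.
  val committed = new AtomicInteger(0)
  // Hypothetical stand-in for a flaky Elasticsearch: element 3 fails the first time only.
  val failedOnce = new AtomicBoolean(false)

  def index(n: Int): Int =
    if (n == 3 && failedOnce.compareAndSet(false, true))
      throw new RuntimeException(s"indexing $n failed")
    else n

  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("restart-sketch")
    implicit val mat: ActorMaterializer = ActorMaterializer()
    try {
      val restartable = RestartSource.onFailuresWithBackoff(
        minBackoff = 100.millis,
        maxBackoff = 1.second,
        randomFactor = 0.2
      ) { () =>
        // The factory runs again after every failure, so the source is
        // recreated and resumes right after the last committed element.
        Source((committed.get() + 1) to 5)
          .map(index)                        // may throw and fail the inner stream
          .map { n => committed.set(n); n }  // "commit" only after indexing succeeded
      }
      Await.result(restartable.runWith(Sink.seq), 10.seconds)
    } finally system.terminate()
  }
}
```

Calling `RestartWholeStreamSketch.run()` retries the failing element after the backoff instead of skipping it. Applied to the question, the factory would presumably return `Consumer.committableSource(consumerSettings, Subscriptions.topics("test")).via(flow)`, so that a failure in `flow` also tears down the consumer and the new consumer resumes from the last committed offset. This also assumes the materializer-wide `Supervision.Restart` decider is removed, because with that decider the `map` and `mapAsync` stages drop the failing element and the failure never reaches `RestartSource`.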