
Spark Streaming: maintaining Kafka offsets periodically while processing

In Spark Streaming's direct approach (no receiver) for Kafka, there is a way to find the offset ranges covered by each batch. However, I would like to maintain offsets periodically so that, if needed, I can reprocess items from a given offset. Is there any way to retrieve the offset of a message in the RDD while I am processing each message? For example, with offsetRanges I have the start and end offsets for the RDD, but suppose the system encounters an error while processing a record and the job ends. If I then want to resume from the record that failed, how do I save the last successfully processed offset so that I can start from it next time?
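For context, one way to get at per-message offsets with the 0.8 direct API is the `KafkaUtils.createDirectStream` overload that takes explicit starting offsets plus a `messageHandler`; the `MessageAndMetadata` passed to that handler carries each record's offset. Below is a minimal Scala sketch along those lines. The broker address, topic, and starting offsets are assumptions, the persistence step is only a placeholder, and a real job would usually persist offsets once per partition or batch rather than after every record.

```scala
import kafka.common.TopicAndPartition
import kafka.message.MessageAndMetadata
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object PerMessageOffsets {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("per-message-offsets"), Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // assumed broker
    // Start from the last offsets you saved; hard-coded here for illustration.
    val fromOffsets = Map(TopicAndPartition("my-topic", 0) -> 0L)

    // The messageHandler sees MessageAndMetadata, which includes the offset.
    val messageHandler =
      (mmd: MessageAndMetadata[String, String]) => (mmd.offset, mmd.message)

    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder, (Long, String)](
      ssc, kafkaParams, fromOffsets, messageHandler)

    stream.foreachRDD { rdd =>
      rdd.foreachPartition { records =>
        records.foreach { case (offset, msg) =>
          // ... process msg here ...
          // After success, persist `offset` to a durable store
          // (ZooKeeper, a database, ...) so a restart can resume from it.
          println(s"processed message at offset $offset")
        }
      }
    }

    ssc.start()
    ssc.awaitTermination()
  }
}
```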

Spark 1.3 introduced a new direct approach (no receiver) that hides this low-level complexity behind the scenes. In case of a failure, and given sufficient Kafka retention, messages can be recovered from Kafka automatically after a restart.
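A minimal sketch of that approach (Scala, spark-streaming-kafka 0.8 API) might look like the following; the broker address, topic, and checkpoint directory are assumed placeholders:

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.{HasOffsetRanges, KafkaUtils}

object DirectStreamExample {
  def createContext(): StreamingContext = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("direct-kafka"), Seconds(10))
    ssc.checkpoint("hdfs:///tmp/kafka-checkpoint") // assumed path

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092") // assumed broker
    val stream = KafkaUtils.createDirectStream[
      String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, Set("my-topic")) // assumed topic

    stream.foreachRDD { rdd =>
      // Each batch RDD carries the exact Kafka offset range it covers.
      val ranges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
      ranges.foreach { r =>
        println(s"${r.topic}/${r.partition}: ${r.fromOffset} -> ${r.untilOffset}")
      }
      rdd.map(_._2).count() // stand-in for real processing of the message values
    }
    ssc
  }

  def main(args: Array[String]): Unit = {
    // getOrCreate restores the context (and unfinished offset ranges) from the
    // checkpoint after a restart, which is what enables automatic recovery.
    val ssc = StreamingContext.getOrCreate(
      "hdfs:///tmp/kafka-checkpoint", createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}
```

Note that checkpoint-based recovery only works while the Kafka retention window still contains the unprocessed offsets, which is the caveat mentioned above.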
