
Delayed Queue implementation in Storm – Kafka, Cassandra, Redis or Beanstalk?

Original 2016-02-13 20:49:31 0 3 java/ message-queue/ apache-kafka/ apache-storm/ delayed-execution

I have a Storm topology that processes messages from Kafka and, depending on the task at hand, makes an HTTP call or saves to Cassandra. I process the messages as soon as they arrive. However, a few messages are not processed completely because of the response from external sources such as an HTTP server. I would like to implement an exponential backoff mechanism for retries in case the HTTP server does not respond or returns an error message, so that the call is retried after some time. I can think of a few ways to achieve this. I would like to know which of them is the better solution, and whether there is any other fault-tolerant solution I could use. Since this is used to implement exponential backoff, each message will have a different delay time.

1. Send it to another topic in Kafka which is consumed later. This is my preferred solution. I know we can use the Kafka offset to consume the message at a later stage. However, I could not find documentation or sample code to do this. It would be really helpful if anyone could help me out with this.
2. Write the message to Cassandra / Redis and write a scheduler to fetch the messages which are not yet processed and are ready to be consumed, and send them to Kafka so that my Storm topology can consume them. (Existing solution in another legacy, non-Storm project.)
3. Send it to Beanstalk with a delay. (Existing solution in another legacy, non-Storm project. However, I would like to avoid this solution and use it only if I am out of options.)

While this is pretty much what I would like to do, I am not able to find documentation on how to implement delayProcessingUntil as mentioned in "Kafka - Delayed Queue implementation using high level consumer".

I have done scheduled jobs from a data store and delays using Beanstalk in the past, but I would prefer to use Kafka.
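To make the retry-topic idea concrete, here is a minimal sketch of what a delayProcessingUntil step could look like. It is not taken from the linked question or from any Kafka API: the RetryMessage envelope, its dueAtMillis field, and both method names are assumptions made for illustration. The idea is that the producer stamps each retried message with the earliest time it may be processed, and the consumer waits until that time before handling it.

```java
import java.time.Clock;

public class DelayUntilDue {

    /** Hypothetical envelope written to the retry topic (not a Kafka type). */
    public static final class RetryMessage {
        public final String payload;
        public final int attempt;
        public final long dueAtMillis; // earliest time the message may be processed

        public RetryMessage(String payload, int attempt, long dueAtMillis) {
            this.payload = payload;
            this.attempt = attempt;
            this.dueAtMillis = dueAtMillis;
        }
    }

    /** Milliseconds the consumer still has to wait; 0 if the message is already due. */
    public static long remainingDelayMillis(RetryMessage msg, Clock clock) {
        return Math.max(0L, msg.dueAtMillis - clock.millis());
    }

    /** Blocks the calling thread until the message is due (assumed name, see lead-in). */
    public static void delayProcessingUntil(RetryMessage msg, Clock clock)
            throws InterruptedException {
        long wait = remainingDelayMillis(msg, clock);
        if (wait > 0) {
            Thread.sleep(wait);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Clock clock = Clock.systemUTC();
        // A message that becomes due 200 ms from now.
        RetryMessage msg = new RetryMessage("GET /resource", 2, clock.millis() + 200);
        delayProcessingUntil(msg, clock);
        System.out.println("processing " + msg.payload + " (attempt " + msg.attempt + ")");
    }
}
```

One caveat with this approach: a topic partition is consumed in order, so with a different delay per message, a long delay at the head of the partition also holds back shorter delays queued behind it. Using a separate retry topic per delay tier is one way to limit that effect.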

3 answers

I think your use case describes the need for a database rather than a queue. You want to temporarily store records until their time and then remove them so they don't show up in future searches. Trying to do that in a queue would be awkward at best, as your analysis shows.

I suggest you create another column family in Cassandra to hold these delayed requests. You'd store the request itself along with a time to retry. Whether you'd want to also have a time series of failed HTTP attempts and related data is up to you. As a delayed request is finally fulfilled, you'd delete the corresponding row from the CF. The search for delayed requests is straightforward, too.

Of course, any database, or even a file on the local drive or in HDFS, could work too.
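The store-then-poll pattern described above can be sketched as follows. This is only an in-memory stand-in, not Cassandra code: the class DelayedRequestStore, the DelayedRequest row type, and the method names are all hypothetical. It shows the three operations the answer relies on: save a request with its retry time, find the requests that are due, and delete a row once the request has finally succeeded.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DelayedRequestStore {

    /** Hypothetical row: the request itself plus the time to retry it. */
    public static final class DelayedRequest {
        public final String id;
        public final String payload;
        public final long retryAtMillis;

        public DelayedRequest(String id, String payload, long retryAtMillis) {
            this.id = id;
            this.payload = payload;
            this.retryAtMillis = retryAtMillis;
        }
    }

    // In-memory stand-in for the database table, keyed by request id.
    private final Map<String, DelayedRequest> rows = new HashMap<>();

    /** Stores (or overwrites) a delayed request. */
    public synchronized void save(DelayedRequest request) {
        rows.put(request.id, request);
    }

    /** Returns every request whose retry time has passed, earliest first. */
    public synchronized List<DelayedRequest> findDue(long nowMillis) {
        List<DelayedRequest> due = new ArrayList<>();
        for (DelayedRequest r : rows.values()) {
            if (r.retryAtMillis <= nowMillis) {
                due.add(r);
            }
        }
        due.sort(Comparator.comparingLong(r -> r.retryAtMillis));
        return due;
    }

    /** Deletes the row once the request has been fulfilled. */
    public synchronized void delete(String id) {
        rows.remove(id);
    }

    public static void main(String[] args) {
        DelayedRequestStore store = new DelayedRequestStore();
        store.save(new DelayedRequest("a", "POST /orders", 100L));
        store.save(new DelayedRequest("b", "GET /status", 500L));
        for (DelayedRequest r : store.findDue(200L)) {
            System.out.println("retrying " + r.id);
            store.delete(r.id); // only after the retry succeeded
        }
    }
}
```

Rows are deleted only after a successful retry rather than when they are fetched, so a crash between fetch and retry leaves the request in the store to be picked up again.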

The Kafka spout has exponential backoff message retry built in. You can configure the initial delay, the delay multiplier and the maximum delay through the spout configuration. If there is an error in the bolt, you can call collector.fail(input). After that, you just leave it to the spout to do the retry.

https://github.com/apache/storm/blob/v0.10.0/external/storm-kafka/src/jvm/storm/kafka/ExponentialBackoffMsgRetryManager.java
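To illustrate how the three settings interact, here is a small standalone sketch of an exponential backoff schedule with a cap. It does not use Storm classes and is not copied from the linked source; the class and method names are made up for illustration, and the formula (initial delay times multiplier to the power of retries so far, capped at the maximum) is an assumption about the general technique. In the storm-kafka version linked above, the corresponding settings appear to be the SpoutConfig fields retryInitialDelayMs, retryDelayMultiplier and retryDelayMaxMs, but check the linked source for the exact names and behaviour.

```java
public class BackoffSchedule {

    /**
     * Delay before the given retry (1 = first retry), in milliseconds:
     * initialDelayMs * multiplier^(retryNum - 1), capped at maxDelayMs.
     */
    public static long retryDelayMillis(int retryNum, long initialDelayMs,
                                        double multiplier, long maxDelayMs) {
        if (retryNum < 1) {
            throw new IllegalArgumentException("retryNum must be >= 1");
        }
        double raw = initialDelayMs * Math.pow(multiplier, retryNum - 1);
        // Math.min on doubles also handles overflow to infinity for large retryNum.
        return (long) Math.min((double) maxDelayMs, raw);
    }

    public static void main(String[] args) {
        // Illustrative values only: 1 s initial delay, doubling, capped at 10 s.
        for (int retry = 1; retry <= 6; retry++) {
            System.out.println("retry " + retry + ": "
                    + retryDelayMillis(retry, 1000L, 2.0, 10_000L) + " ms");
        }
    }
}
```

With these illustrative values the delays are 1000, 2000, 4000, 8000, 10000 and 10000 ms for retries 1 through 6.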

You might be interested in the Kafka Retry project (https://github.com/IBM/kafka-retry). It provides a delayed retry queue using a single retry topic.

Related questions

Delayed queue / message processing in Storm

Kafka - Delayed Queue implementation using high level consumer

Consuming data from Kafka queue using Storm Toplology

Java Huge csv file processing and storing using Apache Spark/ Kafka/ Storm to Cassandra

Kafka consumer- Pause polling of event from specific kafka topic partition to use it as delayed queue

Kafka + Storm - satisfying dependencies

Kafka Storm Integration

Dynamic cassandra connection in storm bolt

Delayed ACK in Spring Kafka

BufferUnderflowException from storm kafka spout




solution1
1 2016-02-14 14:42:00

solution2
1 ACCEPTED 2016-02-14 17:53:04

solution3
0 2020-11-20 12:54:49
