
Add retry mechanism for bolt on Apache Storm

I have a bolt (dispatcher) in my Storm topology which opens an HTTP connection to send a request.

I want to add a retry mechanism in case of failure (connection timeout, failure status, etc.). The retry should occur only in the dispatcher bolt and should not start the whole topology over.

Usually what I would do is add a queue which would be responsible for the retry and exception handling (for example, after 3 attempts automatically dispatch the message to an error queue).

Is it OK to do such a thing inside a bolt? Does anyone have experience with that, and could you suggest a library I could use?

Sure! That seems like a reasonable way to handle errors. I'm not sure what library you would need to use except for the one which provides the API for connecting to the queuing system of your choice.

Inside your bolt, you might have code which looks like this:

public void execute(Tuple tuple, BasicOutputCollector collector) {
   try {
      // do something which might fail here...
   } catch (Exception e) {
      // do you want to log the error?
      // (pass the exception as the last argument so the stack trace is logged)
      LOG.error("Bolt error", e);
      // do you want the error to show up in storm UI? (e.g. collector.reportError(e))
      // or just put information on the queue for processing later
   }
}

As long as you catch the exception inside your bolt, it will not propagate out of execute and crash the worker, so your topology will not restart.
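
The "retry a few times, then hand off to an error queue" idea from the question could be sketched in plain Java roughly as follows. This is only an illustration under stated assumptions: `BoundedRetry`, its `Action` interface, and the in-memory `Queue<String>` standing in for a real error queue are all hypothetical names, not part of Storm or any queuing library.

```java
import java.util.Queue;

/** Hypothetical helper: tries an action a bounded number of times, then parks the message. */
final class BoundedRetry {

    /** The work to attempt, e.g. the HTTP call made by the dispatcher bolt. */
    interface Action {
        void run(String message) throws Exception;
    }

    private final int maxAttempts;
    // Stand-in for a real error queue client (an assumption for illustration).
    private final Queue<String> errorQueue;

    BoundedRetry(int maxAttempts, Queue<String> errorQueue) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
        this.errorQueue = errorQueue;
    }

    /**
     * Runs the action up to maxAttempts times.
     * Returns true on the first success. If every attempt fails, the message
     * is added to the error queue and false is returned.
     */
    boolean dispatch(String message, Action action) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run(message);
                return true;
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop retrying.
                Thread.currentThread().interrupt();
                break;
            } catch (Exception e) {
                // A real bolt would log the failure here before the next attempt.
            }
        }
        errorQueue.add(message);
        return false;
    }
}
```

Inside execute, the try block would then shrink to a single call such as `retry.dispatch(payload, msg -> sendHttp(msg))`, where `sendHttp` is whatever method performs the request. Retrying in a loop blocks the bolt's executor thread for the duration of the attempts, so the attempt count (and any delay added between attempts) should be kept small.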

Another option is to leverage Storm's built-in guaranteed message processing: fail the tuple in the bolt and let the spout re-emit it. For example, a spout that re-sends failed messages up to a maximum number of failures:

package banktransactions;

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

import org.apache.log4j.Logger;

// Note: Storm 1.0+ uses org.apache.storm.* instead of backtype.storm.*
import backtype.storm.spout.SpoutOutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.topology.OutputFieldsDeclarer;
import backtype.storm.topology.base.BaseRichSpout;
import backtype.storm.tuple.Fields;
import backtype.storm.tuple.Values;

public class TransactionsSpouts extends BaseRichSpout {

    private static final Integer MAX_FAILS = 2;
    Map<Integer,String> messages;
    Map<Integer,Integer> transactionFailureCount;
    Map<Integer,String> toSend;
    private SpoutOutputCollector collector;

    static Logger LOG = Logger.getLogger(TransactionsSpouts.class);

    public void ack(Object msgId) {
        // The message is done, so it no longer needs to be kept for re-sending
        messages.remove(msgId);
        LOG.info("Message fully processed ["+msgId+"]");
    }

    public void close() {
    }

    public void fail(Object msgId) {
        if(!transactionFailureCount.containsKey(msgId))
            throw new RuntimeException("Error, transaction id not found ["+msgId+"]");
        Integer transactionId = (Integer) msgId;

        //Get the number of failures for this transaction
        Integer failures = transactionFailureCount.get(transactionId) + 1;
        if(failures >= MAX_FAILS){
            //If it reaches the max fails, bring down the topology
            throw new RuntimeException("Error, transaction id ["+transactionId+"] has had many errors ["+failures+"]");
        }
        //Otherwise save the new failure count and re-send the message
        transactionFailureCount.put(transactionId, failures);
        toSend.put(transactionId, messages.get(transactionId));
        LOG.info("Re-sending message ["+msgId+"]");
    }

    public void nextTuple() {
        if(!toSend.isEmpty()){
            for(Map.Entry<Integer, String> transactionEntry : toSend.entrySet()){
                Integer transactionId = transactionEntry.getKey();
                String transactionMessage = transactionEntry.getValue();
                //Emit with a message id so Storm calls ack/fail for it
                collector.emit(new Values(transactionMessage),transactionId);
            }
            /*
             * The nextTuple, ack and fail methods run in the same loop, so
             * we can consider the clear method atomic
             */
            toSend.clear();
        }
        try {
            //Sleep briefly so the spout loop does not spin
            Thread.sleep(1);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public void open(Map conf, TopologyContext context,
            SpoutOutputCollector collector) {
        Random random = new Random();
        messages = new HashMap<Integer, String>();
        toSend = new HashMap<Integer, String>();
        transactionFailureCount = new HashMap<Integer, Integer>();
        for(int i = 0; i< 100; i++){
            messages.put(i, "transaction_"+random.nextInt());
            transactionFailureCount.put(i, 0);
        }
        //Queue every message for its first emit
        toSend.putAll(messages);
        this.collector = collector;
    }

    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("transactionMessage"));
    }
}
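
The spout above gives up by throwing a RuntimeException once a message reaches MAX_FAILS. If the goal is instead to route such messages to an error queue, as the question describes, the counting logic could be pulled out into something like the sketch below. `FailureTracker` and its dead-letter list are hypothetical names used for illustration only; the list stands in for whatever error queue is actually used.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: counts failures per message id and dead-letters repeat offenders. */
final class FailureTracker {
    private final int maxFails;
    private final Map<Integer, Integer> failureCount = new HashMap<>();
    // Stand-in for a real error queue (an assumption for illustration).
    private final List<Integer> deadLetters = new ArrayList<>();

    FailureTracker(int maxFails) {
        this.maxFails = maxFails;
    }

    /**
     * Records one failure for the given id.
     * Returns true if the message should be re-sent, or false if it has now
     * failed maxFails times and was moved to the dead-letter list instead.
     */
    boolean recordFailure(int transactionId) {
        int failures = failureCount.merge(transactionId, 1, Integer::sum);
        if (failures >= maxFails) {
            failureCount.remove(transactionId);
            deadLetters.add(transactionId);
            return false;
        }
        return true;
    }

    /** Forgets the failure history of a message that was fully processed. */
    void recordSuccess(int transactionId) {
        failureCount.remove(transactionId);
    }

    List<Integer> deadLetters() {
        return deadLetters;
    }
}
```

A spout's fail method could call `recordFailure` and re-queue the message only when it returns true, and its ack method could call `recordSuccess`. This keeps the worker alive when a single message keeps failing, at the cost of having to drain the dead-letter collection somewhere.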


