
What happens to the messages when the connection is down between the Kafka server and the producer?

I'm new to Kafka with Spring Boot, and I'm working on a project into which I want to integrate Kafka using Spring. The problem is that I want to send messages from the producer to the consumer even when the Kafka server isn't running (offline mode).

Can anyone give me an example of how to use Kafka in offline mode? I can't find a tutorial on this topic. I want to stop my Kafka server (for example) while the producer still wants to send data to the topic, so that the consumer can later get these messages. What's the best solution? Are these valid options?

*sending the data to a file, and when the server comes back up (I test the connection, for example), exporting the data from the file to the topic

*sending the data to a database, and when the server comes back up (test the connection), likewise sending the messages (data) from the database to my topic

*using a queue or a list to store the messages, and when the server comes back up (test the connection), sending the data from the list to the topic; the problem is that I have a lot of messages

-->If there are other solutions, with a simple example, can anyone help me?

Here is an example with a Redis broker, where we test the connection between the Redis broker and the producer: if the connection fails, I store the data inside a queue that can hold many messages, and when the connection between Redis and the producer comes back, the producer takes these messages from the queue and sends them to the Redis broker.

But the problem with this broker is that a few messages were lost, so we decided to integrate a Kafka broker into my project instead of the Redis broker!

Can anyone give me an example in Java of how to store a lot of messages before the producer sends them to the Kafka cluster? Or what's the best solution to this problem, given that we don't want to reuse the same queue approach?

This Python example shows how to store a message inside a queue if the connection to the server fails:

    import urllib.request

    # r (the Redis client), qArret (the buffer queue), topicProduction and
    # json_background_of_message1 (the message) are defined elsewhere
    try:
        # Cheap reachability check against the broker host
        urllib.request.urlopen('http://serverAdress', timeout=0.1)
        r.publish(topicProduction, json_background_of_message1)
        print(json_background_of_message1)
        arretControle = True
    except Exception as e:
        # Connection failed: buffer the message in the queue instead
        qArret.put(json_background_of_message1)
        print("arret")
        arretControle = True

//qArret is the queue in which we can store a lot of messages when the connection fails; json_background_of_message1 is the message being sent
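For comparison, the same test-the-connection-then-buffer pattern could look roughly like this in Java. This is just a sketch: the host URL, the queue size, and the `publish` method are placeholders rather than a real client.

    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    public class BufferingPublisher {
        // Bounded in-memory buffer for messages that couldn't be sent
        private final BlockingQueue<String> bufferQueue = new LinkedBlockingQueue<>(10_000);

        public void publishOrBuffer(String message) {
            try {
                // Cheap reachability check, like urlopen(..., timeout=0.1) above
                HttpURLConnection conn = (HttpURLConnection) new URL("http://serverAdress").openConnection();
                conn.setConnectTimeout(100);
                conn.connect();
                conn.disconnect();

                // Connection is back: drain whatever was buffered, then send the new message
                String buffered;
                while ((buffered = bufferQueue.poll()) != null) {
                    publish(buffered);
                }
                publish(message);
            } catch (Exception e) {
                // Broker unreachable: keep the message for later (offer() drops it if the buffer is full)
                bufferQueue.offer(message);
            }
        }

        private void publish(String message) {
            // Placeholder for the real broker call (e.g. Redis publish or Kafka producer.send)
        }
    }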

Kafka is designed to be a highly-available messaging system. Configured correctly, and depending on the replication factor, you can have multiple brokers go down completely, for days at a time, and the cluster will still work (albeit probably under higher load). No Kafka production cluster I've worked with has ever been completely down once it was successfully deployed. We've had individual brokers go down, sometimes for days at a time, but that was never a problem.

What you're proposing is a fallback or backup method in case Kafka is not available. However, you still have the same problem. If you dump messages to a file, how long until you run out of disk space? If you store messages in a database, how long until the database runs out of space? If you store messages in an in-memory queue, how long until you run out of memory and crash the application? And now you also have to build a mechanism to recover from a Kafka outage, which adds complexity and overhead.

The best approach with Kafka is to configure and operate it as a highly-available system: set up alerts and metrics properly, so you'll be immediately alerted and can react promptly if something goes wrong. You should also always size and test your applications so you have enough headroom to handle even your worst-case scenario. If you configure a replication factor of 3, you can lose any two brokers and the cluster will still function with no data loss.
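As an illustration (not from the original answer), creating such a topic programmatically with the Kafka AdminClient might look like this; the topic name, partition count, and bootstrap address are assumptions for the example:

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CreateTopicExample {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (AdminClient admin = AdminClient.create(props)) {
                // 3 partitions, replication factor 3: the topic survives the loss of any two brokers
                NewTopic topic = new NewTopic("my-topic", 3, (short) 3);
                admin.createTopics(Collections.singletonList(topic)).all().get();
            }
        }
    }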

Now, on the application side, your behavior when Kafka is unavailable should depend on how important the messages are. If you can tolerate losing messages, then just drop them when the producer returns an exception, and log it/send an alert. However, if they're extremely important records, then you shouldn't acknowledge/commit the messages on your upstream system (wherever the records originated from) until you have full confirmation that they're saved in Kafka. For this I would recommend setting producer acks to -1 or all, using multiple retries in case of failure, and setting up a proper Callback on the producer.send() method. See here for a much more detailed explanation: https://kafka.apache.org/21/javadoc/index.html?org/apache/kafka/clients/producer/Callback.html
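A minimal sketch of such a producer, with acks=all, retries, and a send callback; the topic name, serializers, and bootstrap address are assumptions for the example:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class ReliableProducerExample {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.ACKS_CONFIG, "all");                 // wait for all in-sync replicas
            props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);  // retry transient failures

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("my-topic", "key", "value"), (metadata, exception) -> {
                    if (exception != null) {
                        // Delivery failed after all retries: log/alert, don't acknowledge upstream
                    } else {
                        // Record is safely stored in Kafka: safe to commit/acknowledge upstream
                    }
                });
            } // close() flushes pending sends, so the callback fires before exit
        }
    }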

For more details, like others have said, please give the official docs a read: https://kafka.apache.org/documentation/
