
Replicating messages from one Kafka topic to another Kafka topic

I want to set up a flow from a Kafka topic in the prod cluster into another Kafka cluster in the dev environment, for scalability and regression testing.

As a duct-tape solution, I chained a Kafka consumer and a producer, but my instinct tells me there should be a better way. I haven't found a good solution yet. Can anyone help?

If you want to replicate data from one cluster to another, there is a Kafka tool called MirrorMaker.

Kafka comes with a tool for mirroring data between Kafka clusters. The tool reads from a source cluster and writes to a destination cluster. Data will be read from topics in the source cluster and written to a topic with the same name in the destination cluster.

Here is the syntax for running the MirrorMaker tool:

bin/kafka-run-class.sh kafka.tools.MirrorMaker \
       --consumer.config consumer.properties \
       --producer.config producer.properties \
       --whitelist my-topic

You can find this script in the Kafka installation directory. You need to provide the consumer.properties of your source cluster and the producer.properties of your destination cluster. You can choose which topics are mirrored with the --whitelist option.
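As a rough illustration of the two files mentioned above, the following sketch writes a minimal consumer.properties and producer.properties. The broker addresses (prod-kafka:9092, dev-kafka:9092), the group id, and the CONF_DIR directory are assumed placeholders, not values from the question. It also assumes a Kafka release in which MirrorMaker uses the Java consumer; very old releases configure the consumer differently.

```shell
#!/bin/sh
# Sketch: generate minimal MirrorMaker config files.
# Hostnames, ports, group id, and CONF_DIR are hypothetical placeholders.
CONF_DIR="${CONF_DIR:-./mirror-maker-conf}"
mkdir -p "$CONF_DIR"

# Consumer side: points at the SOURCE (prod) cluster.
cat > "$CONF_DIR/consumer.properties" <<'EOF'
bootstrap.servers=prod-kafka:9092
group.id=mirror-maker-dev
auto.offset.reset=earliest
EOF

# Producer side: points at the DESTINATION (dev) cluster.
cat > "$CONF_DIR/producer.properties" <<'EOF'
bootstrap.servers=dev-kafka:9092
acks=all
EOF

echo "Wrote $CONF_DIR/consumer.properties and $CONF_DIR/producer.properties"
```

The two generated files can then be passed to --consumer.config and --producer.config in the command above.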

You can find more information in the "Mirroring data between clusters" section of the Kafka documentation.

Note: MirrorMaker copies data into a topic in the destination cluster with the same name as in the source cluster.

MirrorMaker works well as a cross-cluster solution. Within a single cluster, however, your duct-tape solution is not bad, because MirrorMaker assumes you are pulling from one cluster into another.

So if you simply want to copy data between different topics in the same cluster, kafkacat is your friend.

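# Broker address and the topics to copy between
export BOOTSTRAP_SERVERS=localhost:9096
export SOURCE_TOPIC=source_topic
export TARGET_TOPIC=target_topic

# Consume the source topic from the beginning, exit at the end of the topic (-e),
# and produce every message to the target topic
kafkacat -C -b "$BOOTSTRAP_SERVERS" -o beginning -e -t "$SOURCE_TOPIC" | kafkacat -P -b "$BOOTSTRAP_SERVERS" -t "$TARGET_TOPIC"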
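The pipeline above copies message values only. If keys matter, a variant using kafkacat's -K key delimiter option might look like the sketch below. The function name copy_topic_with_keys and the KAFKACAT override variable are made up for illustration (newer releases name the binary kcat). It assumes text payloads: embedded tabs or newlines, binary data, and null keys may not round-trip cleanly.

```shell
#!/usr/bin/env bash
# Sketch: copy a topic within one cluster, keeping message keys.
# KAFKACAT is a hypothetical override hook; newer releases ship the binary as "kcat".
KAFKACAT="${KAFKACAT:-kafkacat}"

copy_topic_with_keys() {
  local brokers="$1" source_topic="$2" target_topic="$3"
  # -K '\t' makes the consumer print "key<TAB>value" and makes the
  # producer split each input line into key and value at the TAB.
  "$KAFKACAT" -C -b "$brokers" -t "$source_topic" -o beginning -e -K '\t' |
    "$KAFKACAT" -P -b "$brokers" -t "$target_topic" -K '\t'
}
```

For example, `copy_topic_with_keys localhost:9096 source_topic target_topic` would do the same copy as the command above while carrying the keys along.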

Kafka is basically a message queue, so it is passive: something has to put messages into it (a producer), and something has to pull messages out of it (a consumer).

If you want a pipeline between two Kafka topics, so that messages from one topic automatically flow to the other, you'll need some code that acts as a consumer of the first topic and a producer to the second.

Depending on your programming language, you can choose among ready-to-use, well-documented producer and consumer libraries.

For more sophisticated cases, you can check out Apache Storm and similar tools.
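Without writing any client code, the consumer-plus-producer pipeline described above can be approximated with the console tools that ship with Kafka. This is only a sketch: the function name pipe_topic and the KAFKA_BIN path are assumptions, and the producer's --bootstrap-server flag assumes a reasonably recent Kafka release (older ones used --broker-list). The console consumer prints only message values by default, so keys and headers are dropped, and the pipeline keeps running until it is interrupted.

```shell
#!/usr/bin/env bash
# Sketch: pipe one topic into another using Kafka's console tools.
# KAFKA_BIN is an assumed install path; adjust it for your environment.
KAFKA_BIN="${KAFKA_BIN:-bin}"

pipe_topic() {
  local src_brokers="$1" src_topic="$2" dst_brokers="$3" dst_topic="$4"
  # The consumer reads the source topic from the beginning and writes one
  # message value per line to stdout; the producer sends each stdin line
  # as a message to the destination topic.
  "$KAFKA_BIN/kafka-console-consumer.sh" \
      --bootstrap-server "$src_brokers" \
      --topic "$src_topic" \
      --from-beginning |
  "$KAFKA_BIN/kafka-console-producer.sh" \
      --bootstrap-server "$dst_brokers" \
      --topic "$dst_topic"
}
```

A call such as `pipe_topic prod-kafka:9092 my-topic dev-kafka:9092 my-topic` (hostnames hypothetical) also works across clusters, since the source and destination brokers are passed separately.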

If you need to copy messages from one topic into another, perhaps with some additional logic or transformation, you can also use Kafka Streams:

https://docs.confluent.io/current/streams/index.html

And an example:

https://github.com/confluentinc/kafka-streams-examples/blob/5.4.0-post/src/main/java/io/confluent/examples/streams/WordCountLambdaExample.java

Alternatively, check MirrorMaker.
