
Python produce to different Kafka partition

I am trying to learn Kafka by working through the classic Twitter streaming example. I want my producer to stream Twitter data matching two filters to different partitions of the same topic. For example, tweets matching track='Google' should go to one partition and tweets matching track='Apple' to another.

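from tweepy import Stream
from tweepy.streaming import StreamListener  # tweepy 3.x API

# auth, producer (a KafkaProducer) and topic_name are defined elsewhere.
class Producer(StreamListener):
    def __init__(self, producer):
        self.producer = producer

    def on_data(self, data):
        # Forward each raw tweet to Kafka; no key or partition is given here.
        self.producer.send(topic_name, value=data)
        return True

    def on_error(self, error):
        print(error)


twitter_stream = Stream(auth, Producer(producer))
twitter_stream.filter(track=["Google"])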

How do I add another track and stream that data to another partition?

Likewise, how do I make my consumer consume from a specific partition?

consumer = KafkaConsumer(
    topic_name,
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='latest',
    enable_auto_commit=True,
    auto_commit_interval_ms=5000,
    max_poll_records=100,
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))

After some research, I was able to resolve this issue:

On the producer side, specify the partition:

self.producer.send(topic_name, value=data, partition=0)
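
To send each track to its own partition, the partition number can be computed per tweet instead of being hard-coded. The following is only a sketch under stated assumptions: `TRACK_PARTITIONS`, `pick_partition` and `RoutingListener` are hypothetical names, the tweet is assumed to arrive as a JSON string with a `text` field, and the topic is assumed to have at least two partitions. The producer is any object with a kafka-python style `send(topic, value=..., partition=...)` method.

```python
import json

# Hypothetical keyword-to-partition mapping (assumes the topic has >= 2 partitions).
TRACK_PARTITIONS = {"google": 0, "apple": 1}


def pick_partition(raw_tweet, track_partitions=TRACK_PARTITIONS, default=0):
    """Return the partition of the first tracked keyword found in the tweet text."""
    text = json.loads(raw_tweet).get("text", "").lower()
    for keyword, partition in track_partitions.items():
        if keyword in text:
            return partition
    # No tracked keyword in the "text" field: fall back to the default partition.
    return default


class RoutingListener:
    """Stand-in for the StreamListener subclass in the question; only on_data is shown."""

    def __init__(self, producer, topic):
        self.producer = producer
        self.topic = topic

    def on_data(self, data):
        # Same send() call as above, but with a partition chosen per tweet.
        self.producer.send(self.topic, value=data, partition=pick_partition(data))
        return True
```

With the tweepy setup from the question, both keywords would presumably be passed in one call, e.g. `twitter_stream.filter(track=["Google", "Apple"])`. A tweet that mentions both keywords goes to whichever keyword comes first in the mapping.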

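On the consumer side, leave the topic out of the constructor and assign the partition explicitly:

from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='latest',
    enable_auto_commit=True,
    auto_commit_interval_ms=5000,
    max_poll_records=100,
    value_deserializer=lambda x: json.loads(x.decode('utf-8')))
# Read only partition 0 of the topic 'trial'.
consumer.assign([TopicPartition('trial', 0)])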

By default, Kafka chooses the partition from the key of the message. In your code you pass only a value to the producer, so the key is null and messages are spread across all partitions (round-robin or at random, depending on the client) instead of being tied to one.

Refer to the documentation for your Kafka library to see how to set a key for each message.
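
As a rough illustration of key-based partitioning, the sketch below hashes the key and takes the remainder by the partition count. `toy_partition` and `send_keyed` are hypothetical helpers, and CRC32 is only a stand-in: real clients use their own hash function, so the actual partition numbers will differ. The `send` call assumes a kafka-python style producer that accepts `key=` as bytes.

```python
import zlib


def toy_partition(key, num_partitions):
    """Illustrative only: equal keys always land on the same partition.

    CRC32 is a stand-in here, not the hash a real Kafka client uses.
    """
    return zlib.crc32(key) % num_partitions


def send_keyed(producer, topic, track, data):
    """Send one message keyed by its track, leaving partition choice to the partitioner."""
    # The key is encoded to bytes here; a key_serializer on the producer could do this instead.
    return producer.send(topic, key=track.encode("utf-8"), value=data)
```

Note that two different keys can hash to the same partition, especially with few partitions, so a key guarantees that all 'Google' messages stay together but not that 'Google' and 'Apple' are kept apart. If strict separation is required, passing `partition=` explicitly is the more direct option.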

