
kafka-python: produce and consume messages from the same topic at the same time with concurrent processes/scripts

Kafka is set up locally:

bin/zookeeper-server-start.sh config/zookeeper.properties
bin/kafka-server-start.sh config/server.properties

and an example test topic to store data is created:

bin/kafka-topics.sh --create --topic fortest --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1

A sample script is created to send example data and then read it back from the same test topic:

import time
from kafka import KafkaProducer, KafkaConsumer
import multiprocessing

TOPIC = 'fortest'
producer = KafkaProducer(bootstrap_servers=['localhost:9092'])

consumer = KafkaConsumer(
    TOPIC,
    bootstrap_servers=['localhost:9092'],
    auto_offset_reset='latest',
    group_id='my-consumer-1'
    )

def store_message():
    for _ in range(100):
        msg = b'message'
        producer.send(topic=TOPIC, value=msg)
        print(f'{msg} sent by Producer')
        time.sleep(3)

def get_processed_message():
    while True:
        messages = consumer.poll(timeout_ms=5000)

        if not messages:
            print('waiting for messages')
            time.sleep(5)
        else:
            print(f"Got messages: {messages.values()}")

It works sequentially like this:

if __name__ == '__main__':
    store_message()
    get_processed_message()

But the question is: is it possible to run both functions concurrently, with the producer constantly sending and the consumer constantly reading messages on the same topic at the same time? I tried to do this using multiprocessing:

if __name__ == '__main__':
    produce_initial_message = multiprocessing.Process(target=store_message)
    consume_processed_message = multiprocessing.Process(target=get_processed_message)
    produce_initial_message.start()
    consume_processed_message.start()

but only the sending works; consumer.poll() never returns any value in this case and keeps 'waiting' for messages. The same happens if I move the consumer initialization and logic to a different .py script and run both at the same time in different terminals. How does this need to be adjusted to work that way? (Or does this require more complicated logic/additional agents besides the consumer and producer?)
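Setting Kafka aside, the concurrent produce/consume structure the question is after can be exercised with a plain queue.Queue standing in for the topic. This is a hedged, broker-free sketch of the pattern only, not kafka-python code; all names here are illustrative:

```python
# Broker-free sketch of a concurrent produce/consume loop.
# queue.Queue plays the role of the 'fortest' topic; queue.get(timeout=...)
# is loosely analogous to consumer.poll(timeout_ms=...).
import queue
import threading

topic = queue.Queue()   # stands in for the Kafka topic
received = []

def store_message(n=5):
    # producer side: push n messages onto the "topic"
    for i in range(n):
        topic.put(f'message-{i}'.encode())

def get_processed_message(n=5):
    # consumer side: keep polling until n messages arrived or a timeout hits
    while len(received) < n:
        try:
            msg = topic.get(timeout=5)
        except queue.Empty:
            break
        received.append(msg)

t_producer = threading.Thread(target=store_message)
t_consumer = threading.Thread(target=get_processed_message)
t_producer.start()
t_consumer.start()
t_producer.join()
t_consumer.join()
print(received)
```

Both threads run at the same time against the shared queue, which is the shape the Kafka version needs as well, with the extra wrinkle that real network clients buffer and batch under the hood.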

Solved by:

  1. Changing the store_message() method to use an infinite loop as well, calling producer.flush() after each sent message:
def store_message():
    while True:
        msg = b'message'
        producer.send(topic=TOPIC, value=msg)
        print(f'{msg} sent by Producer')
        producer.flush()
        time.sleep(3)
  2. Using threading for concurrent running instead of multiprocessing:
import threading

if __name__ == '__main__':
    t_producer = threading.Thread(target=store_message, daemon=True)
    t_consumer = threading.Thread(target=get_processed_message, daemon=True)
    t_producer.start()
    t_consumer.start()
    while True:
        time.sleep(1)  # keep the main thread alive without busy-waiting

Now it runs just as planned, thank you.
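As a side note, the keep-alive loop can be replaced with a threading.Event so the worker loops can be asked to stop cleanly instead of being killed as daemon threads. A minimal Kafka-free sketch of that shutdown pattern (the names and timings are illustrative, not from the original script):

```python
# Sketch: cooperative shutdown with threading.Event instead of an endless
# keep-alive loop. stop.wait() doubles as an interruptible sleep.
import threading
import time

stop = threading.Event()
sent = []

def store_message():
    # keeps producing until shutdown is requested
    while not stop.is_set():
        sent.append(b'message')
        stop.wait(0.01)   # like time.sleep(3), but wakes early on shutdown

t_producer = threading.Thread(target=store_message)
t_producer.start()
time.sleep(0.1)   # let it run briefly
stop.set()        # request shutdown
t_producer.join() # main thread blocks here rather than spinning
```

With this shape the thread no longer needs to be a daemon, and the producer gets a chance to flush before the process exits.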
