
PyKafka - sending messages and receiving acknowledgments asynchronously

PyKafka has the limitation that:

delivery report queue is thread-local: it will only serve reports for messages which were produced from the current thread
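In practice this means that get_delivery_report() only returns reports for messages produced from the same thread: a report queue filled by produce() calls in one thread is invisible to every other thread. A minimal sketch of the effect (Python 3; assuming, for illustration only, a local broker on 127.0.0.1:9092 and an existing topic b'test'):

import queue
import threading

from pykafka import KafkaClient

# Sketch only: assumed local broker and topic, not part of the original question.
client = KafkaClient(hosts='127.0.0.1:9092')
producer = client.topics[b'test'].get_producer(delivery_reports=True)

# The delivery report is queued for the thread that called produce() (the main thread).
producer.produce(b'hello', partition_key=b'1')

def read_report_from_other_thread():
    try:
        # Runs in a different thread, so it only sees that thread's (empty) report queue.
        msg, exc = producer.get_delivery_report(block=True, timeout=5)
        print('worker thread got report for key {}: {}'.format(msg.partition_key, exc))
    except queue.Empty:
        print('worker thread: no report visible here')

t = threading.Thread(target=read_report_from_other_thread)
t.start()
t.join()

# The same call from the producing thread does return the report.
msg, exc = producer.get_delivery_report(block=True)
print('main thread got report for key {}: {}'.format(msg.partition_key, exc))
producer.stop()

The worker thread times out with queue.Empty, while the identical call made from the producing thread returns the report.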

I'm trying to write a script where I can asynchronously send messages using one function, and keep receiving acknowledgments via another function.

Here are the functions:

def SendRequest(producer):
    count = 0
    while True:
        count += 1
        producer.produce('test msg', partition_key='{}'.format(count))
        if count == 50000:
            endtime = datetime.datetime.now()
            print "EndTime : ", endtime
            print "Done sending all messages. Waiting for response now"
            return



def GetResponse(producer):

    count_response = 0

    while True:
        try:
            msg, exc = producer.get_delivery_report(block=False)
            if exc is not None:
                count_response += 1
                print 'Failed to deliver msg {}: {}'.format(
                    msg.partition_key, repr(exc))
            else:
                print "Count Res :", count_response
                count_response += 1

        except Queue.Empty:
            pass

        except Exception, e:
            print "Unhandled exception : ", e

Threading and multiprocessing did not help. The two functions above need to run asynchronously/in parallel. What approach should be used here?

Question : where I can asynchronously send messages ... and keep receiving acknowledgments

This solution with asyncio.coroutine will meet your needs.

Note: There are a few drawbacks!

  • This asyncio code needs at least Python 3.5
  • For every Message, a new Task is created

This is implemented in the class AsyncProduceReport:

import asyncio
from pykafka import KafkaClient
import queue, datetime

class AsyncProduceReport(object):
    def __init__(self, topic):
        self.client = KafkaClient(hosts='127.0.0.1:9092')
        self.topic = self.client.topics[bytes(topic, encoding='utf-8')]
        self.producer = self.topic.get_producer(delivery_reports=True)
        self._tasks = 0

    # async
    @asyncio.coroutine
    def produce(self, msg, id):
        print("AsyncProduceReport::produce({})".format(id))
        self._tasks += 1
        self.producer.produce(bytes(msg, encoding='utf-8'))

        # await - resume next awaiting task
        result = yield from self.get_delivery_report(id)

        self._tasks -= 1
        # These return values are passed to self.callback(task)
        return id, result

    def get_delivery_report(self, id):
        """
         This part of a Task runs until the delivery_report is received
        :param id: ID of the Message
        :return: (True, None) on success, otherwise (False, exc)
        """
        print("{}".format('AsyncProduceReport::get_delivery_report({})'.format(id)))

        while True:
            try:
                msg, exc = self.producer.get_delivery_report(block=False)
                return (not exc, exc)

            except queue.Empty:
                # await - resume next awaiting task
                yield from asyncio.sleep(1)

    @staticmethod
    def callback(task):
        """
         Processing Task Results
        :param task: Holds the Return values from self.produce(...)
        :return: None
        """
        try:
            id, result = task.result()
            print("AsyncProduceReport::callback: Msg:{} delivery_report:{}"
                    .format(id, result))
        except Exception as e:
            print(e)

    def ensure_futures(self):
        """
         This is the first Task
         Creates a new task for every Message
        :return: None
        """

        # Create 3 Tasks for this testcase
        for id in range(1, 4):
            # Schedule the execution of self.produce(id): wrap it in a future. 
            # Return a Task object.
            # The task will resume at the next await
            task = asyncio.ensure_future(self.produce('test msg {} {}'
                     .format(id, datetime.datetime.now()), id))

            # Add a Result Callback function
            task.add_done_callback(self.callback)

            # await - resume next awaiting task
            # This sleep value could be 0; 5 is used only for this testcase
            # Raising this value gives waiting tasks more time to run
            yield from asyncio.sleep(5)
            # print('Created task {}...'.format(_id))

        # await - all tasks completed
        while self._tasks > 0:
            yield from asyncio.sleep(1)

Usage:

if __name__ == '__main__':
    client = AsyncProduceReport('topic01')        
    loop = asyncio.get_event_loop()
    loop.run_until_complete(client.ensure_futures())
    loop.close()
    print("{}".format('EXIT main()'))

Output:

AsyncProduceReport::produce(1)
AsyncProduceReport::get_delivery_report(1)
AsyncProduceReport::produce(2)
AsyncProduceReport::get_delivery_report(2)
AsyncProduceReport::callback: Msg:1 delivery_report:(True, None)
AsyncProduceReport::produce(3)
AsyncProduceReport::get_delivery_report(3)
AsyncProduceReport::callback: Msg:2 delivery_report:(True, None)
AsyncProduceReport::callback: Msg:3 delivery_report:(True, None)

Tested with Python:3.5.3 - pykafka:2.7.0
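Note that generator-based coroutines (@asyncio.coroutine / yield from) are deprecated since Python 3.8 and removed in Python 3.11; on Python 3.5+ the same approach can be written with native async/await syntax. A condensed sketch under the same assumptions (local broker at 127.0.0.1:9092, topic 'topic01'); as in the class above, each task simply takes the next available delivery report:

import asyncio
import datetime
import queue

from pykafka import KafkaClient


class AsyncProduceReportAwait(object):
    """Sketch: AsyncProduceReport rewritten with native async/await (Python 3.5+)."""

    def __init__(self, topic):
        client = KafkaClient(hosts='127.0.0.1:9092')
        self.producer = client.topics[bytes(topic, encoding='utf-8')] \
            .get_producer(delivery_reports=True)

    async def produce(self, msg, id):
        # produce() and get_delivery_report() both run in the event-loop thread,
        # so they share the same thread-local delivery report queue.
        self.producer.produce(bytes(msg, encoding='utf-8'))
        while True:
            try:
                # Takes the next available report (not necessarily for this exact
                # message), just like the get_delivery_report() coroutine above.
                _msg, exc = self.producer.get_delivery_report(block=False)
                return id, exc is None
            except queue.Empty:
                await asyncio.sleep(1)  # yield control to the other tasks

    async def run(self, count=3):
        coros = [self.produce('test msg {} {}'.format(i, datetime.datetime.now()), i)
                 for i in range(1, count + 1)]
        for id, ok in await asyncio.gather(*coros):
            print('Msg:{} delivered:{}'.format(id, ok))


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(AsyncProduceReportAwait('topic01').run())
    loop.close()

asyncio.gather replaces the manual task counter and the done-callback: it schedules all coroutines concurrently and returns their results in order once every delivery report has been consumed.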
