
Python Confluent-kafka DeserializingConsumer is not reading the messages from Kafka topic, even though the messages are present

I have a topic with an Avro schema. I produce the messages via Python code and that works completely fine. When I consume the messages from the CLI, I can consume them successfully without errors.

When I try to consume via Python code, it prints 'None' - basically it polls but gets nothing back. I also tried printing the offset, and it shows '-1001'.
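An aside on the -1001 value: it is not a real position but mirrors librdkafka's OFFSET_INVALID sentinel, meaning "no valid offset has been established yet". A minimal illustration, with the constant written out rather than imported from the library:

```python
# -1001 mirrors confluent_kafka.OFFSET_INVALID, librdkafka's sentinel
# meaning "no valid offset has been stored or established yet".
OFFSET_INVALID = -1001

def is_real_offset(offset):
    # A position/committed offset is only meaningful when non-negative.
    return offset >= 0

print(is_real_offset(OFFSET_INVALID))  # False: nothing consumed or committed yet
```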

The method is meant to read all of the latest messages, collect them into a list, and return that list.

Note: I have also tried setting 'enable.auto.commit' = True, but it didn't help, so I removed it from my config.

Library in requirements.txt: confluent-kafka[avro]>=1.4.2

    from confluent_kafka import DeserializingConsumer, TopicPartition
    from confluent_kafka.schema_registry import SchemaRegistryClient
    from confluent_kafka.schema_registry.avro import AvroDeserializer
    from confluent_kafka.serialization import SerializationError as SerializerError

    conf = {
        'bootstrap.servers': 'dummyvalue',
        'security.protocol': 'dummyvalue',
        'sasl.mechanism': 'PLAIN',
        'sasl.username': 'dummyvalue',
        'sasl.password': 'dummyvalue',
        'session.timeout.ms': 45000,
        'schema.registry.url': 'dummyvalue',
        'basic.auth.credentials.source': 'dummyvalue',
        'basic.auth.user.info': 'dummyvalue',
        'use.latest.version': True
    }



    schema_registry_conf = {
        'url': conf['schema.registry.url'],
        'basic.auth.user.info': conf['basic.auth.user.info']
    }



    def _set_consumer_config(self, conf, avro_deserializer):
        # _popSchemaRegistryParamsFromConfig removes the schema-registry-only
        # keys from the main conf dict, so consumer_conf holds only
        # consumer-relevant properties.
        consumer_conf = self._popSchemaRegistryParamsFromConfig(conf)
        consumer_conf['value.deserializer'] = avro_deserializer
        consumer_conf['group.id'] = "python_example"
        consumer_conf['auto.offset.reset'] = 'latest'
        return consumer_conf

    
    def get_list_of_unconsumed_msgs(self, topic):
        with open('avro schema file path') as text_file:
            avro_schema = text_file.read()
        schema_registry_client = SchemaRegistryClient(schema_registry_conf)
        avro_deserializer = AvroDeserializer(schema_registry_client, avro_schema)
        consumer = DeserializingConsumer(self._set_consumer_config(conf, avro_deserializer))
        consumer.subscribe([topic])
        messages = []
        polling_count = 5
        while polling_count >= 1:
            try:
                print(consumer.position([TopicPartition(topic, 0)]))
                print(f"Consumer Committed {consumer.committed([TopicPartition(topic, 0)])}")
                print(f"Consumer Assignment {consumer.assignment()}")
                msg = consumer.poll(3.0)
                if msg is None:
                    polling_count -= 1
                    continue
                elif msg.error():
                    print('error: {}'.format(msg.error()))
                else:
                    messages.append([msg.value()])
            except SerializerError as e:
                # Report malformed record, discard it, continue polling
                print("Message deserialization failed {}".format(e))
        consumer.close()
        return messages

    def main():
        topic_name = "aa_automation_test"
        msg = obj.get_list_of_unconsumed_msgs(topic_name)
        print(f"Received Message as :- {msg}")

Output of the print statements:

[Prints an empty list; for debugging I printed the offset and it shows -1001]
[TopicPartition{topic=aa_automation_test,partition=0,offset=-1001,error=None}]
Consumer Committed [TopicPartition{topic=aa_automation_test,partition=0,offset=-1001,error=None}]
Consumer Assignment [] 
Received Message as :- []

You're seeing default values.

There won't be any data until you actually poll and connect to the brokers.

Kafka keeps track of uncommitted offsets on its own; you don't need to implement that logic in your app.

If you print an empty list at the end, then from what you've shown, you have reached the end of the topic.

To pull up to 5 messages at a time, see consume(num_messages).
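A minimal sketch of such a batch fetch; `drain_up_to` is a hypothetical helper name, and the consumer is assumed to already be constructed and subscribed as in the question:

```python
def drain_up_to(consumer, n=5, timeout=3.0):
    """Fetch up to n messages in one consume() call instead of
    poll()-ing them one at a time."""
    # consume() returns a list of up to n Message objects; the list may
    # be shorter, or empty, if the timeout expires first.
    batch = consumer.consume(num_messages=n, timeout=timeout)
    return [m.value() for m in batch if not m.error()]
```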

To check the (end) offsets for partitions, use get_watermark_offsets, and subtract the committed offset from consumer.committed() to see the lag.
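For illustration, with made-up offset numbers; in real code the values would come from consumer.get_watermark_offsets(tp) and consumer.committed([tp]):

```python
# Hypothetical offsets; a connected consumer would supply them via:
#   low, high = consumer.get_watermark_offsets(tp)
#   committed = consumer.committed([tp])[0].offset
low, high = 0, 120   # watermarks: first offset and next offset to be written
committed = 115      # last committed offset for this consumer group
lag = high - committed
print(lag)  # 5 messages not yet consumed
```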
