
Python Kafka consumer fails to poll some messages

The code for my Kafka consumer looks like this:

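from time import sleep

from kafka import KafkaConsumer, TopicPartition

# kafka_config and log are assumed to be defined elsewhere in the application.


def read_messages_from_kafka():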
    topic = 'my-topic'
    consumer = KafkaConsumer(
        bootstrap_servers=['my-host1', 'my-host2'],
        client_id='my-client',
        group_id='my-group',
        auto_offset_reset='earliest',
        enable_auto_commit=False,
        api_version=(0, 8, 2)
    )
    consumer.assign([TopicPartition(topic, 0), TopicPartition(topic, 1)])

    messages = consumer.poll(timeout_ms=kafka_config.poll_timeout_ms, max_records=kafka_config.poll_max_records)

    for partition in messages.values():
        for message in partition:
            log.info("read {}".format(message))

    if messages:
        consumer.commit()

    next_offset0, next_offset1 = consumer.position(TopicPartition(topic, 0)), consumer.position(TopicPartition(topic, 1))
    log.info("next offset0={} and offset1={}".format(next_offset0, next_offset1))

while True:
    read_messages_from_kafka()
    sleep(kafka_config.poll_sleep_ms / 1000.0)

I have realised that this consumer setup does not read all the messages. I cannot reproduce the problem on demand because it is intermittent.

When I compare the last 100 messages reported by kafkacat with what this consumer read, I find that my consumer intermittently misses a few messages at random. What's wrong with my consumer?

kafkacat -C -b my-host1 -X broker.version.fallback=0.8.2.1 -t my-topic -o -100
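One way to make the comparison less manual is to log each record's partition and offset (kafka-python records expose `message.partition` and `message.offset`) and then look for holes in the sequence. The helper below is only a sketch: the name `find_offset_gaps` is hypothetical, and it assumes a non-compacted, non-transactional topic, where offsets within a partition are expected to be consecutive.

```python
from collections import defaultdict


def find_offset_gaps(records):
    """Return {partition: [missing offsets]} for the offsets the consumer logged.

    `records` is an iterable of (partition, offset) pairs. Only gaps between
    the smallest and largest offset seen in each partition are reported.
    """
    by_partition = defaultdict(set)
    for partition, offset in records:
        by_partition[partition].add(offset)

    gaps = {}
    for partition, offsets in by_partition.items():
        expected = range(min(offsets), max(offsets) + 1)
        missing = [o for o in expected if o not in offsets]
        if missing:
            gaps[partition] = missing
    return gaps


if __name__ == "__main__":
    logged = [(0, 10), (0, 11), (0, 13), (1, 5), (1, 6)]
    print(find_offset_gaps(logged))
```

A non-empty result points at the exact offsets to look up with kafkacat, which is easier than eyeballing the last 100 messages.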

There are just too many ways to consume messages in Python. There should be one, and preferably only one, obvious way to do it.

There is a problem with missing messages in your Kafka client. I found a solution here:

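import json

while True:
    raw_messages = consumer.poll(timeout_ms=1000, max_records=5000)
    for topic_partition, messages in raw_messages.items():
        # poll() returns {TopicPartition: [records]}, so visit every record
        for message in messages:
            application_message = json.loads(message.value.decode())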
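The loop above can be wrapped into a small function that handles every record in every partition and commits only after processing. This is a sketch under assumptions, not a confirmed fix for the question: `process_batch` and `handle` are hypothetical names, and `consumer` is expected to be a single long-lived kafka-python `KafkaConsumer` created once outside the loop rather than on every call.

```python
import json


def process_batch(consumer, handle, timeout_ms=1000, max_records=500):
    """Poll once, pass every decoded record to `handle`, then commit.

    `consumer` only needs poll(timeout_ms=..., max_records=...) returning
    {TopicPartition: [records]} and commit(), as KafkaConsumer provides.
    Returns the number of records processed.
    """
    batch = consumer.poll(timeout_ms=timeout_ms, max_records=max_records)
    count = 0
    for topic_partition, records in batch.items():
        for record in records:
            # Assumes JSON payloads, as in the snippet above.
            handle(topic_partition, json.loads(record.value.decode()))
            count += 1
    if count:
        # Commit only after the whole batch has been handled.
        consumer.commit()
    return count
```

A caller would create the consumer once and then run `while True: process_batch(consumer, handle)`, so that fetch state and positions carry over between polls.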

There is also another Kafka client: confluent_kafka. It has no such problem.
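For comparison, a minimal confluent_kafka consumer might look like the sketch below. The function names `build_consumer` and `consume_one` are hypothetical, the configuration mirrors the settings from the question, and the JSON decoding is an assumption carried over from the snippet above. The import is done lazily so the processing function can be exercised without the package installed.

```python
import json


def build_consumer(servers, group_id, topic):
    """Create and subscribe a confluent_kafka Consumer (requires the package)."""
    from confluent_kafka import Consumer

    consumer = Consumer({
        "bootstrap.servers": servers,      # e.g. "my-host1,my-host2"
        "group.id": group_id,
        "auto.offset.reset": "earliest",
        "enable.auto.commit": False,
    })
    consumer.subscribe([topic])
    return consumer


def consume_one(consumer, handle, timeout_s=1.0):
    """Poll for a single message; return True if one was handled."""
    msg = consumer.poll(timeout_s)
    if msg is None:
        return False  # nothing arrived within the timeout
    if msg.error():
        raise RuntimeError(str(msg.error()))
    handle(json.loads(msg.value().decode()))
    # Synchronous commit of this message's offset after it was handled.
    consumer.commit(message=msg, asynchronous=False)
    return True
```

Unlike kafka-python's `poll()`, which returns a dictionary of record lists, `Consumer.poll()` here returns one message (or `None`) per call, so there is no inner loop to get wrong.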

