
kafka-python consumer not receiving messages

I am having trouble getting KafkaConsumer to read from the beginning, or from any other explicit offset.

Running the command line consumer for the same topic, I do see messages with the --from-beginning option; without it, the tool hangs.

$ ./kafka-console-consumer.sh --zookeeper {localhost:port} --topic {topic_name} --from-beginning

If I run it through Python, it hangs, which I suspect is caused by incorrect consumer configs:

consumer = KafkaConsumer(topic_name,
                     bootstrap_servers=['localhost:9092'],
                     group_id=None,
                     auto_commit_enable=False,
                     auto_offset_reset='smallest')

print "Consuming messages from the given topic"
for message in consumer:
    print "Message", message
    if message is not None:
        print message.offset, message.value

print "Quit"

Output:

Consuming messages from the given topic (hangs after that)

I am using kafka-python 0.9.5 and the broker runs Kafka 0.8.2. Not sure what the exact problem is.

Set group_id=None, as suggested by dpkp, to emulate the behavior of the console consumer.

The difference between the console consumer and the Python consumer code you have posted is that the Python consumer uses a consumer group to save offsets: group_id="test-consumer-group". If you set group_id=None instead, you should see the same behavior as the console consumer.

auto_offset_reset='earliest' solved the problem for me.

auto_offset_reset='earliest' together with group_id=None solved the problem for me.
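The group_id and auto_offset_reset suggestions above can be combined into one set of consumer settings. This is only a sketch: it assumes a kafka-python release from 1.x onward, where the options are spelled enable_auto_commit and 'earliest' (the question's snippet uses the older 0.9.x spellings auto_commit_enable and 'smallest'). The helper names console_like_config and consume_all are made up for illustration, and consumer_timeout_ms is added so the loop ends after an idle period instead of blocking forever.

```python
def console_like_config(bootstrap_servers, idle_timeout_ms=5000):
    """Settings that roughly mimic `kafka-console-consumer --from-beginning`."""
    return {
        "bootstrap_servers": bootstrap_servers,
        "group_id": None,                   # no consumer group, so no stored offsets are reused
        "enable_auto_commit": False,        # nothing to commit without a group
        "auto_offset_reset": "earliest",    # start from the oldest available message
        "consumer_timeout_ms": idle_timeout_ms,  # stop iterating after this much idle time
    }


def consume_all(topic, bootstrap_servers):
    """Print every message currently readable from `topic` (needs a reachable broker)."""
    # Imported here so the config helper above can be inspected without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(topic, **console_like_config(bootstrap_servers))
    try:
        for message in consumer:
            print(message.offset, message.value)
    finally:
        consumer.close()
```

Calling consume_all('my-topic', ['localhost:9092']) would then print whatever the topic holds and return once no message arrives for five seconds.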

I ran into the same problem: I can receive messages in the Kafka console consumer, but can't get any message with a Python script using the kafka-python package.

Finally I figured out the reason: I didn't call producer.flush() and producer.close() in my producer.py, which is not mentioned in its documentation.
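A minimal sketch of that producer-side fix follows. KafkaProducer.send() is asynchronous, so a short script can exit before anything has actually left the client. The helper name publish_and_flush is hypothetical; it accepts any object with send(), flush() and close(), which is the interface kafka-python's KafkaProducer exposes.

```python
def publish_and_flush(producer, topic, payloads):
    """Send every payload, wait for delivery, then close the producer.

    Returns the number of payloads handed to the producer.
    """
    sent = 0
    try:
        for payload in payloads:
            producer.send(topic, payload)  # only queues the record
            sent += 1
        producer.flush()                   # block until queued records are sent
    finally:
        producer.close()                   # release sockets and background thread
    return sent


# Usage against a real broker (assumes kafka-python is installed and a broker
# listens on localhost:9092):
#
#   from kafka import KafkaProducer
#   publish_and_flush(KafkaProducer(bootstrap_servers=["localhost:9092"]),
#                     "my-topic", [b"msg0", b"msg1"])
```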

My take: print the offset and make sure it is what you expect it to be, using position() and seek_to_beginning(). Please see the comments in the code.

I can't explain:

  1. Why are partitions not assigned right after instantiating KafkaConsumer? Is this by design? The workaround is to call poll() once before seek_to_beginning().
  2. Why does the first call to poll() after seek_to_beginning() sometimes return no data and leave the offset unchanged?

Code:

import kafka
print(kafka.__version__)
from kafka import KafkaProducer, KafkaConsumer
from time import sleep
KAFKA_URL = 'localhost:9092' # kafka broker
KAFKA_TOPIC = 'sida3_sdtest_topic' # topic name

# ASSUMING THAT the topic exists

# write to the topic
producer = KafkaProducer(bootstrap_servers=[KAFKA_URL])
for i in range(20):
    producer.send(KAFKA_TOPIC, ('msg' + str(i)).encode() )
producer.flush()

# read from the topic
# auto_offset_reset='earliest' is only consulted when no stored offset is found; it's NOT what we need here
consumer = KafkaConsumer(KAFKA_TOPIC,
                         bootstrap_servers=[KAFKA_URL],
                         max_poll_records=2,
                         group_id='sida3')

# (!?) wtf, why we need this to get partitions assigned
# AssertionError: No partitions are currently assigned if poll() is not called
consumer.poll()
consumer.seek_to_beginning()

# also AssertionError: No partitions are currently assigned if poll() is not called
print('partitions of the topic: ',consumer.partitions_for_topic(KAFKA_TOPIC))

from kafka import TopicPartition
print('before poll() x2: ')
print(consumer.position(TopicPartition(KAFKA_TOPIC, 0)))
print(consumer.position(TopicPartition(KAFKA_TOPIC, 1)))

# (!?) sometimes the first call to poll() returns nothing and doesnt change the offset
messages = consumer.poll()
sleep(1)
messages = consumer.poll()

print('after poll() x2: ')
print(consumer.position(TopicPartition(KAFKA_TOPIC, 0)))
print(consumer.position(TopicPartition(KAFKA_TOPIC, 1)))

print('messages: ', messages)

Output:

2.0.1
partitions of the topic:  {0, 1}
before poll() x2: 
0
0
after poll() x2: 
0
2
messages:  {TopicPartition(topic='sida3_sdtest_topic', partition=1): [ConsumerRecord(topic='sida3_sdtest_topic', partition=1, offset=0, timestamp=1600335075864, timestamp_type=0, key=None, value=b'msg0', headers=[], checksum=None, serialized_key_size=-1, serialized_value_size=4, serialized_header_size=-1), ConsumerRecord(topic='sida3_sdtest_topic', partition=1, offset=1, timestamp=1600335075864, timestamp_type=0, key=None, value=b'msg1', headers=[], checksum=None, serialized_key_size=-1, serialized_value_size=4, serialized_header_size=-1)]}
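On the two open points in this answer, a likely explanation (not confirmed in this thread) is that a subscribed consumer only gets its partitions while joining the group inside poll(), and that poll() defaults to timeout_ms=0, so it can return before the first fetch completes. One way around both, sketched below, is to assign partitions manually and pass an explicit timeout. The helper name read_from_beginning is hypothetical, and it assumes the consumer was created without a topic argument, since assign() cannot be mixed with a subscription.

```python
try:
    from kafka import TopicPartition
except ImportError:
    # Stand-in with the same two fields, only so this sketch can be run
    # without kafka-python installed.
    from collections import namedtuple
    TopicPartition = namedtuple("TopicPartition", ["topic", "partition"])


def read_from_beginning(consumer, topic, timeout_ms=1000):
    """Assign all partitions of `topic`, rewind them, and poll once.

    `consumer` is expected to behave like a KafkaConsumer built WITHOUT a
    topic, e.g. KafkaConsumer(bootstrap_servers=[...]).
    """
    partition_ids = consumer.partitions_for_topic(topic)
    if not partition_ids:
        raise ValueError("no partitions known for topic %r" % topic)

    partitions = [TopicPartition(topic, p) for p in sorted(partition_ids)]
    consumer.assign(partitions)              # explicit assignment, no group join needed
    consumer.seek_to_beginning(*partitions)  # rewind exactly these partitions
    # A non-zero timeout gives the offset lookup and first fetch time to finish.
    return consumer.poll(timeout_ms=timeout_ms)
```

The trade-off is that manual assignment gives up group rebalancing, so this fits one-off readers and debugging scripts better than a pool of cooperating consumers.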

I faced the same issue before, so I ran kafka-topics locally on the machine running the code to test, and I got an UnknownHostException. I added the IP and the host name to the hosts file, and it worked fine in both kafka-topics and the code. It seems that KafkaConsumer was trying to fetch the messages but failed without raising any exceptions.
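Since the consumer may fail silently in this situation, a quick pre-flight check from the client machine can help. The sketch below uses only the standard library; the helper name unresolvable_hosts is hypothetical, and it assumes plain 'host:port' strings. It is worth passing in the host names the broker advertises as well as the bootstrap address.

```python
import socket


def unresolvable_hosts(servers, resolve=socket.getaddrinfo):
    """Return the host names from 'host:port' strings that fail DNS lookup.

    `resolve` is injectable only so the function can be exercised without
    touching the network.
    """
    failed = []
    for server in servers:
        host, _, port = server.rpartition(":")
        try:
            resolve(host, int(port))
        except socket.gaierror:
            # The Python-side counterpart of Java's UnknownHostException.
            failed.append(host)
    return failed
```

For example, unresolvable_hosts(['kafka:29092', 'localhost:9092']) returning ['kafka'] would point to a missing hosts-file entry for the broker's advertised name.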

For me, I had to specify the router's IP in the Kafka PLAINTEXT configuration.

Get the router's IP with:

echo $(ifconfig | grep -E "([0-9]{1,3}\.){3}[0-9]{1,3}" | grep -v 127.0.0.1 | awk '{ print $2 }' | cut -f2 -d: | head -n1)

and then add PLAINTEXT_HOST://<router_ip>:9092 to the Kafka advertised listeners. In the case of a Confluent Docker service, the configuration is as follows:

   kafka:
    image: confluentinc/cp-kafka:7.0.1
    container_name: kafka
    depends_on:
      - zookeeper
    ports:
      - 9092:9092
      - 29092:29092
    environment:
      - KAFKA_BROKER_ID=1
      - KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181
      - KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:29092,PLAINTEXT_HOST://172.28.0.1:9092
      - KAFKA_LISTENER_SECURITY_PROTOCOL_MAP=PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      - KAFKA_INTER_BROKER_LISTENER_NAME=PLAINTEXT
      - KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1

and finally the Python consumer is:

from kafka import KafkaConsumer
from json import loads

consumer = KafkaConsumer(
    'my-topic',
    bootstrap_servers=['172.28.0.1:9092'],
    auto_offset_reset = 'earliest',
    group_id=None,
)

print('Listening')
for msg in consumer:
    print(msg)
