
kafka-python consumer RecordTooLargeError

I am using a Kafka consumer, but when running it and trying to fetch 1000 messages I get the following error:

kafka.consumer.fetcher.RecordTooLargeError: RecordTooLargeError: ("There are some messages at [Partition=Offset]: {TopicPartition(topic='stag-client-topic', partition=0): 177} whose size is larger than the fetch size 247483647 and hence cannot be ever returned. Increase the fetch size, or decrease the maximum message size the broker will allow.", {TopicPartition(topic='stag-client-topic', partition=0): 177})

I am already using the maximum fetch size in the configuration of my consumer. Here is the function which defines the consumer:

import json

from kafka import KafkaConsumer


def kafka_decoder(x, context=dict()):
    # Decode the raw record value as UTF-8 JSON; return None if it is not valid JSON.
    try:
        return json.loads(x.decode('utf-8'))
    except json.JSONDecodeError:
        return None


def build_consumer(topic, servers, auto_commit, context=dict()):
    try:
        return KafkaConsumer(
            topic,
            bootstrap_servers=servers,
            value_deserializer=lambda value: kafka_decoder(value, context={
                'event_string': value.decode('utf-8')}),
            key_deserializer=lambda key: key.decode('utf-8'),
            group_id='client-',
            api_version=(0, 10, 1),
            enable_auto_commit=auto_commit,
            auto_offset_reset='earliest',
            request_timeout_ms=30000,
            security_protocol='SASL_SSL',
            max_partition_fetch_bytes=247483647,
            max_poll_records=10000,
            fetch_max_wait_ms=4000,
            fetch_max_bytes=247483647,
            sasl_mechanism='PLAIN',
            ssl_check_hostname=False,
            sasl_plain_username='usrname',
            sasl_plain_password='somepsswrd')
    except Exception:
        print('Error in Kafka consumer creation')

Does anyone have any suggestions on how to proceed here?

It's failing on one offset, not just on getting 1000 records (your poll size is 10,000 anyway).

You need to increase max_partition_fetch_bytes=247483647 and fetch_max_bytes=247483647.
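As a minimal sketch of that advice (the broker address and the 300 MiB ceiling are placeholders, not values from the original post), both settings would be raised together so that the largest record the topic can hold fits within a single fetch:

from kafka import KafkaConsumer

# Placeholder ceiling: must be at least as large as the biggest record the
# broker will accept (its message.max.bytes / the topic's max.message.bytes).
MAX_RECORD_BYTES = 300 * 1024 * 1024

consumer = KafkaConsumer(
    'stag-client-topic',
    bootstrap_servers='broker:9092',            # placeholder address
    # Per-partition cap: a record larger than this can never be returned.
    max_partition_fetch_bytes=MAX_RECORD_BYTES,
    # Overall fetch cap: keep it at least as large as the per-partition cap.
    fetch_max_bytes=MAX_RECORD_BYTES,
)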

And you may also want to adjust the maximum size of records Kafka itself can hold.

How can I send large messages with Kafka (over 15MB)?
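If the record size limit on the Kafka side does need to change, one way to do it from Python is through kafka-python's admin client. This is a sketch under the assumption that your kafka-python version ships KafkaAdminClient with alter_configs; the broker address, credentials, and the 20 MiB value are placeholders:

from kafka.admin import KafkaAdminClient, ConfigResource, ConfigResourceType

admin = KafkaAdminClient(
    bootstrap_servers='broker:9092',        # placeholder address
    security_protocol='SASL_SSL',
    sasl_mechanism='PLAIN',
    sasl_plain_username='usrname',
    sasl_plain_password='somepsswrd',
)

# Raise the per-topic record size limit (placeholder value of 20 MiB).
resource = ConfigResource(
    ConfigResourceType.TOPIC,
    'stag-client-topic',
    configs={'max.message.bytes': str(20 * 1024 * 1024)},
)
admin.alter_configs([resource])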

Overall, though, you really should avoid putting over 236 MB of data in a single record to begin with.

Since the error says "whose size is larger than the fetch size 247483647", you should start by adding 1024*1024 bytes, e.g. max_partition_fetch_bytes=248532223, and keep increasing it if that is still too small.
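For what that arithmetic looks like (the starting value is the fetch size reported in the error message):

current_fetch_size = 247483647      # fetch size reported in the error
step = 1024 * 1024                  # grow by 1 MiB at a time
print(current_fetch_size + step)    # 248532223, the value suggested above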
