
Confused about Kafka Consumer Properties

Original 2021-11-05 19:35:09 5 1 java/ apache-kafka/ kafka-consumer-api

I'm a bit confused by some of the consumer API configuration properties. It seems as though they either conflict or cancel each other out. Can someone help me understand the difference between the following keys?

Definitions:

fetch.max.bytes: Maximum amount of data the server should return for a fetch request
max.partition.fetch.bytes: Maximum amount of data per partition the server will return
max.poll.records: The maximum number of records returned in a single call to poll()

Example:
fetch.max.bytes: 30000 (30 KB)
max.partition.fetch.bytes: 20000000 (20 MB)
max.poll.records: 1000

To me it seems like the consumer definition above is saying it can accept up to 20 MB of data per partition, but then it specifies a maximum of only 30 KB per fetch, which doesn't make sense. max.poll.records also seems to limit data intake, since 1000 could be too low or too high depending on the size of each record.

1 Answer

fetch.max.bytes and max.partition.fetch.bytes are fields of the Fetch requests sent to Kafka brokers. They respectively determine the maximum size of the Fetch response the broker will send and the maximum size of data per partition the broker can return. It's the broker that uses these values to compute a Fetch response.
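To see how the two byte limits combine, here is a rough sketch using the values from the question. It is a toy model, not the broker's actual code: the class `FetchBudgetSketch`, the method `allocate`, and the partition names are made up for illustration. It also treats both limits as hard caps, whereas the Kafka documentation describes them as not absolute maximums (an oversized first record batch is still returned so the consumer can make progress).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only (hypothetical names): NOT how the Kafka broker is implemented.
final class FetchBudgetSketch {

    /**
     * Decides how many bytes each partition contributes to one Fetch response,
     * applying the per-partition cap and the overall response cap together.
     */
    static Map<String, Long> allocate(Map<String, Long> availableBytes,
                                      long fetchMaxBytes,
                                      long maxPartitionFetchBytes) {
        Map<String, Long> served = new LinkedHashMap<>();
        long remaining = fetchMaxBytes; // budget left for the whole response
        for (Map.Entry<String, Long> entry : availableBytes.entrySet()) {
            long bytes = Math.min(entry.getValue(),
                                  Math.min(maxPartitionFetchBytes, remaining));
            served.put(entry.getKey(), bytes);
            remaining -= bytes;
        }
        return served;
    }

    public static void main(String[] args) {
        Map<String, Long> available = new LinkedHashMap<>();
        available.put("topic-0", 50_000L); // bytes waiting on each partition
        available.put("topic-1", 50_000L);

        // Values from the question: 30 KB per response, 20 MB per partition.
        System.out.println(allocate(available, 30_000L, 20_000_000L));
        // prints {topic-0=30000, topic-1=0}
    }
}
```

In this simplified model the smaller limit wins: with fetch.max.bytes at 30 KB, the 20 MB per-partition limit is never reached, so the two settings do not conflict, but the second one has no practical effect.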

On the other hand, max.poll.records is a client-side-only configuration. It determines how many records a single call to poll() can return.

The consumer will fetch records in the background and buffer them so records are ready when poll() is called.
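The buffering can be pictured with a small simulation. This is only a sketch of the idea, not the real KafkaConsumer internals: `BufferedPollSketch` and `onFetchResponse` are hypothetical names, and records are plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy model only (hypothetical names): NOT the real KafkaConsumer.
final class BufferedPollSketch {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final int maxPollRecords;

    BufferedPollSketch(int maxPollRecords) {
        this.maxPollRecords = maxPollRecords;
    }

    /** Stands in for a Fetch response arriving and being buffered client-side. */
    void onFetchResponse(List<String> records) {
        buffer.addAll(records);
    }

    /** Hands the application at most maxPollRecords already-buffered records. */
    List<String> poll() {
        List<String> out = new ArrayList<>();
        while (out.size() < maxPollRecords && !buffer.isEmpty()) {
            out.add(buffer.removeFirst());
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedPollSketch consumer = new BufferedPollSketch(2);
        // One large fetch brings back five records at once...
        consumer.onFetchResponse(Arrays.asList("r1", "r2", "r3", "r4", "r5"));
        // ...but the application sees them two at a time.
        System.out.println(consumer.poll()); // prints [r1, r2]
        System.out.println(consumer.poll()); // prints [r3, r4]
        System.out.println(consumer.poll()); // prints [r5]
    }
}
```

The point of the sketch is that the size of a fetch and the size of a poll() result are decoupled: one fetch can feed several poll() calls.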

These settings allow you, for example, to fetch records in batches, which is more efficient, but still pass them to the consumer application in small chunks or even individually, depending on the processing it is doing.
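As a rough illustration of that "fetch big, poll small" setup, the three keys could be set like this. The class and method names are hypothetical and the values are only examples, not recommendations. The sketch uses plain string keys so it runs without the Kafka client on the classpath; a real consumer would also need settings such as bootstrap.servers, group.id, and the key and value deserializers before these properties are passed to the KafkaConsumer constructor.

```java
import java.util.Properties;

// Illustrative values only (hypothetical class and method names).
final class ConsumerConfigSketch {

    /** Large fetches from the broker, small batches handed to the application. */
    static Properties fetchBigPollSmall() {
        Properties props = new Properties();
        // Broker-side limits, sent along with each Fetch request.
        props.setProperty("fetch.max.bytes", "52428800");          // 50 MiB per response
        props.setProperty("max.partition.fetch.bytes", "1048576"); // 1 MiB per partition
        // Client-side limit on what a single poll() call returns.
        props.setProperty("max.poll.records", "100");
        return props;
    }

    public static void main(String[] args) {
        Properties props = fetchBigPollSmall();
        System.out.println(props.getProperty("max.poll.records")); // prints 100
    }
}
```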