Sending messages to Kafka unbuffered using kafkacat
I have a single-node Kafka instance running locally via docker-compose (system: Mac/Arm64, image: wurstmeister/kafka:2.13-2.6.0).
I want to use kafkacat (kcat, installed via Homebrew) to instantly produce and consume messages to and from Kafka.
Here is a minimal script:
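#!/usr/bin/env bash
NUM_MESSAGES=${1:-3} # use arg1 or use default=3
# Kept as a plain string on purpose: it is expanded unquoted below so it splits into separate arguments.
KCAT_ARGS="-q -u -c $NUM_MESSAGES -b localhost:9092 -t unbuffered"
log() { echo "$*" 1>&2; } # log to stderr so stdout carries only the messages
producer() {
    log "starting producer"
    # Emit one message per second and pipe all of them into a single kcat producer.
    for i in $(seq 1 "$NUM_MESSAGES"); do
        echo "msg $i"
        log "produced: msg $i"
        sleep 1
    done | kcat $KCAT_ARGS -P
}
consumer() {
    log "starting consumer"
    # Start at the end of the topic and log every message as it arrives.
    kcat $KCAT_ARGS -C -o end | while read -r line; do
        log "consumed: $line"
    done
}
producer &
consumer &
wait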
I would expect (roughly) the following output:
starting producer
starting consumer
produced: msg 1
consumed: msg 1
produced: msg 2
consumed: msg 2
produced: msg 3
consumed: msg 3
However, the produced and consumed messages arrive fully batched into two groups, even though the consumer and the producer are running in parallel:
starting producer
starting consumer
produced: msg 1
produced: msg 2
produced: msg 3
consumed: msg 1
consumed: msg 2
consumed: msg 3
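To make it easier to see whether messages arrive one by one or as a batch, the log helper from the script could prefix each line with a wall-clock time. This is only a diagnostic sketch and not part of the original script; the one-second resolution of `date` is assumed to be enough because the producer sleeps one second between messages.

```shell
# Variant of the script's log() helper that adds a timestamp (diagnostic aid only).
# Output goes to stderr, like the original helper, so stdout stays reserved for messages.
log() {
    printf '%s %s\n' "$(date +%H:%M:%S)" "$*" >&2
}
```

With this helper, a batched run would show three "consumed" lines carrying the same timestamp, while an unbuffered run would show them roughly one second apart.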
Here are some kafkacat/Kafka producer properties and the values I have already tried in order to change the producer behavior.
# kcat options having no effect on the test case
-u # unbuffered output
-T # act like `tee` and echo input
# kafka properties having no effect on the test case
-X queue.buffering.max.messages=1
-X queue.buffering.max.kbytes=1
-X batch.num.messages=1
-X queue.buffering.max.ms=100
-X socket.timeout.ms=100
-X max.in.flight.requests.per.connection=1
-X auto.commit.interval.ms=100
-X request.timeout.ms=100
-X message.timeout.ms=100
-X offset.store.sync.interval.ms=1
-X message.copy.max.bytes=100
-X socket.send.buffer.bytes=100
-X linger.ms=1
-X delivery.timeout.ms=100
None of the options above had any effect on the pipeline.
What am I missing?
Edit: It seems to be a flushing issue with either kcat or librdkafka. Maybe the -X properties are not used correctly.
Here are the current observations (I will edit them as I learn more):
- When sending a larger payload of 10000 messages with a smaller delay in the script, kcat produces several batches of messages. The batching seems to be size-based, but it is not configurable through any of the -X options.
- The batches are then also correctly picked up by the consumer, so it must be a producer issue.
- I also tried the script in docker with the current kafkacat from the Alpine repos. This one seems to flush a bit earlier, with less data needed to fill the "hidden" buffer. The -X options had no effect there either.
- The -X properties do seem to be checked: if I set out-of-range values, kcat (or maybe librdkafka) complains. However, setting low values for any of the timeout and buffer size values has no effect.
- When calling kcat once for every message (which is a bit of overkill), the messages are produced instantly.
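The per-message workaround could be sketched roughly as follows. The helper name `produce_each` is hypothetical; it simply runs whatever command it is given once per input line, so each line gets its own short-lived process whose input pipe is closed right after the message. The kcat flags in the usage comment are the ones from the script above, and a broker on localhost:9092 is assumed.

```shell
# Hypothetical helper: run the given command once for every line read from stdin.
# Each invocation receives exactly one line on its stdin and then sees end-of-input.
produce_each() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | "$@"
    done
}

# Example usage (assumes a reachable broker and the topic from the question):
#   for i in 1 2 3; do echo "msg $i"; sleep 1; done |
#       produce_each kcat -q -b localhost:9092 -t unbuffered -P
```

The trade-off is that every message pays for a new process and, presumably, a new broker connection, so this is probably only reasonable at low message rates.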
The question remains:
How do I tell a Kafka pipeline to instantly produce my first message?
If you have an example in Go, that would also help, since I am seeing similar behavior in a small Go program using kafka-go. I may post a separate question if I can strip that down to a postable format.
UPDATE: I tried using a bitnami image on a pure Linux host. Producing and consuming via kafkacat works as expected on this system. I will post an answer once I know more.
Here is how I solved the problem.
The issue was not in the Kafka docker images. They all work as expected, although I was able to crash the Java-based Kafkas just by firing up kcat against them. I later added rpk (RedPanda, a non-Java "Kafka"), which was much more stable in my single-node setup.
Findings
- Using kcat, I did not find any way of producing messages instantly without buffering. It notoriously ignores all -X args. (edenhill/kcat version 1.7.0, macOS, Arm64)
- When the input pipe is closed, kcat will flush the output buffer.
- Consuming instantly via kcat is possible and works by default.
Conclusion