Apache Kafka producer throughput and latency

Question

I set up a Kafka cluster on my local machine and tested producers with different throughput settings to see how latency is affected.

I used the kafka-test-perf benchmark for these tests: https://docs.cloudera.com/runtime/7.2.10/kafka-managing/topics/kafka-manage-cli-perf-test.html

I ran several tests, changing the throughput for the Kafka producer each time.

Test 1: 2 Throughput
Test 2: 200 Throughput
Test 3: 2,000 Throughput
Test 4: 20,000 Throughput
Test 5: 200,000 Throughput
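
For reference, the five runs above could be written as command lines for the producer performance tool described on the linked Cloudera page. This is only a sketch: the topic name, run duration, record size, and broker address are assumed placeholder values, not the ones used in these tests. Only the throughput targets come from the list above.

```python
# Build (but do not run) one kafka-producer-perf-test command per test.
# ASSUMPTIONS: topic, duration, record size and broker address are
# placeholders; only the throughput targets come from the tests above.
TARGETS = [2, 200, 2_000, 20_000, 200_000]  # records/sec passed to --throughput


def perf_test_command(throughput, topic="perf-test", duration_s=60,
                      record_size=100, bootstrap="localhost:9092"):
    """Return the argument list for one producer perf-test run."""
    # Size the run so that it lasts about duration_s seconds at the target rate.
    num_records = throughput * duration_s
    return [
        "kafka-producer-perf-test",
        "--topic", topic,
        "--num-records", str(num_records),
        "--record-size", str(record_size),
        "--throughput", str(throughput),
        "--producer-props", f"bootstrap.servers={bootstrap}",
    ]


if __name__ == "__main__":
    for target in TARGETS:
        print(" ".join(perf_test_command(target)))
```

Scaling the record count with the target rate keeps every run at roughly the same length, which makes the latency figures easier to compare across tests.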

Throughput for Kafka Producer

From my perspective, throughput is the number of messages that arrive in a given amount of time.

In every test the throughput equals the records sent per second, except for Test 5, where the records sent per second is 22k records/sec. Does this mean that my producer cannot handle this level of throughput?
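
One way to read these numbers, assuming the configured throughput acts as an upper limit on the send rate rather than a guarantee, is to compare the requested rate with the achieved rate for each test. The sketch below uses the figures from this question; the 5% tolerance is an arbitrary assumption.

```python
# Compare the requested rate with the rate actually achieved in each test.
# Figures come from the question: achieved equals the target for tests 1-4
# and is roughly 22k records/sec for test 5.
RESULTS = {
    "Test 1": (2, 2),
    "Test 2": (200, 200),
    "Test 3": (2_000, 2_000),
    "Test 4": (20_000, 20_000),
    "Test 5": (200_000, 22_000),
}


def is_saturated(target, achieved, tolerance=0.05):
    """True when achieved falls short of target by more than tolerance.

    The 5% default is an assumed threshold, not a Kafka setting.
    """
    return achieved < target * (1 - tolerance)


def saturated_tests(results):
    """Return the names of the tests that did not reach their target rate."""
    return [name for name, (target, achieved) in results.items()
            if is_saturated(target, achieved)]


if __name__ == "__main__":
    for name in saturated_tests(RESULTS):
        target, achieved = RESULTS[name]
        print(f"{name}: asked for {target}/s, got {achieved}/s")
```

Under that reading, only Test 5 falls short of its target, which suggests the setup tops out somewhere around 22k records/sec rather than anywhere near 200,000.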

I am trying to understand what this means.

I ran a lot of tests.

Answer 1

I don't see a big difference between Test 4 and Test 5, which means either that you have reached the maximum throughput for the given hardware configuration or that Kafka needs to be properly tuned for high loads.
Running the load generator and the application under test on the same machine is not the best idea, because the two compete for the same resources. Also, a dedicated load testing tool such as Apache JMeter can give you better control over the workload model and reporting.
Running performance tests against a scaled-down environment won't tell you the full story, and you won't be able to extrapolate the results, especially for a complex application like Kafka. You need to run your tests against production or a production-like environment to get accurate metrics.
I would recommend increasing the load gradually. This way you will be able to correlate the increasing load with increasing throughput and determine the saturation point and the bottleneck more precisely.
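
The gradual ramp-up can be sketched as follows. The helper names, the 20,000 to 30,000 range, and the 5% tolerance are all illustrative assumptions, and `fake_measure` is a stand-in for a real test run: it simply pretends the setup tops out at 22,000 records/sec, the figure reported in the question.

```python
# Step the target rate up gradually and report the first step that the
# producer cannot keep up with. All names here are hypothetical helpers.


def ramp(start, stop, steps):
    """Evenly spaced target rates from start to stop, inclusive."""
    step = (stop - start) / (steps - 1)
    return [round(start + i * step) for i in range(steps)]


def find_saturation(targets, measure, tolerance=0.05):
    """Return the first target whose measured rate falls short by more
    than tolerance, or None if every target was reached."""
    for target in targets:
        achieved = measure(target)
        if achieved < target * (1 - tolerance):
            return target
    return None


def fake_measure(target):
    # STAND-IN for running a real load test at this target rate:
    # pretends the setup cannot exceed 22,000 records/sec.
    return min(target, 22_000)


if __name__ == "__main__":
    targets = ramp(20_000, 30_000, 6)
    print("targets:", targets)
    print("first saturated target:", find_saturation(targets, fake_measure))
```

In a real run, `fake_measure` would be replaced by something that launches the load test at the given rate and returns the records/sec it reports. Smaller steps around the point where the achieved rate stops following the target narrow down the saturation point.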

Answer 1
0, accepted, 2022-11-24 10:24:07
