
Kafka AdminClient performance considerations

I would like to dynamically create Kafka topics. In my case, there can be up to several hundred topics in the application. During system startup, there can be multiple concurrent calls to the method below for each topic.

The AdminClient object has local scope, so it is created on every call. I suspect that a socket and a connection to the Kafka broker are opened underneath, so this solution is not optimal in terms of performance: several hundred connections may be open at any one time.

import java.util.Collections;
import java.util.Properties;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import lombok.RequiredArgsConstructor;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.CreateTopicsResult;
import org.apache.kafka.clients.admin.NewTopic;
import org.apache.kafka.common.KafkaFuture;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Service;

@Service
@RequiredArgsConstructor
class TopicFactory {

   private final Logger log = LoggerFactory.getLogger(TopicFactory.class);

   private final Set<String> topics = ConcurrentHashMap.newKeySet();

   @Value("${kafka.bootstrap.servers}")
   private final String bootstrapServers;

   @Value("${kafka.topic.replication.factor}")
   private final String replicationFactor;
   
   void createTopicIfNotExists(String topicName, int partitionCount) {
      if (topics.contains(topicName)) {
         return;
      }
      Properties properties = new Properties();
      properties.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
      try (Admin admin = Admin.create(properties)) {
         if (isTopicExists(admin, topicName)) {
            topics.add(topicName);
            return;
         }
         NewTopic newTopic = new NewTopic(topicName, partitionCount, Short.parseShort(replicationFactor));
         CreateTopicsResult result = admin.createTopics(Collections.singleton(newTopic));
         KafkaFuture<Void> future = result.values().get(topicName);
         try {
            future.get();
            topics.add(topicName);
         } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            log.error("Interrupted exception occurred during topic creation", e);
         } catch (ExecutionException e) {
            log.error("Execution exception occurred during topic creation", e);
         }
      }
   }

   private boolean isTopicExists(Admin admin, String topicName) {
      try {
         return admin.listTopics().names().get().contains(topicName);
      } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         log.error("Interrupted exception occurred during topic creation", e);
         return false;
      } catch (ExecutionException e) {
         log.error("Execution exception occurred during topic creation", e);
         return false;
      }
   }
}

How can I improve the performance of this solution? Is connection caching a good idea? If so, in what way? As an initialized field in the class, or perhaps using e.g. a Guava cache or Suppliers.memoize(...)? However, the connection with the broker would then have to be maintained all the time.

If you want to improve this solution, as it is written, for hundreds of topics: admin.createTopics takes a whole collection, so don't pass a singleton collection with one topic per call.
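A minimal sketch of the batching idea, assuming topic requests can be collected first and sent together. To keep it runnable without a broker, the broker call is represented by a plain `Consumer`; `TopicBatcher`, `request` and `flush` are hypothetical names, not part of the Kafka API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Consumer;

/** Collects requested topic names and sends them to the broker call as one batch. */
final class TopicBatcher {

    // Hypothetical stand-in for the broker call, e.g. a lambda that maps the
    // names to NewTopic objects and passes the whole collection to admin.createTopics(...).
    private final Consumer<Collection<String>> createAll;

    // Insertion-ordered set, so a topic requested twice is only sent once.
    private final Set<String> pending = new LinkedHashSet<>();

    TopicBatcher(Consumer<Collection<String>> createAll) {
        this.createAll = createAll;
    }

    /** Registers a topic for the next batch; duplicate requests are ignored. */
    synchronized void request(String topicName) {
        pending.add(topicName);
    }

    /** Sends all pending topics in a single call and returns how many were sent. */
    synchronized int flush() {
        if (pending.isEmpty()) {
            return 0;
        }
        List<String> batch = new ArrayList<>(pending);
        createAll.accept(batch);   // one call for the whole batch
        pending.removeAll(batch);  // forget them only if the call did not throw
        return batch.size();
    }
}
```

In the question's code, the callback would presumably build one NewTopic per name and wait on the result of a single admin.createTopics call, so startup issues one request instead of several hundred.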

Also, the result of admin.listTopics() can be cached so that you don't query all topics every time you create one more topic.
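One way to cache the topic list could look like the sketch below. It is only an illustration: `KnownTopics` is a hypothetical helper, and the `Supplier` is a stand-in for whatever fetches the names (in the question's code, something like `admin.listTopics().names().get()` with its checked exceptions handled).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Loads the broker's topic names once and answers later lookups from memory. */
final class KnownTopics {

    // Hypothetical stand-in for the broker query that lists topic names.
    private final Supplier<Set<String>> loader;

    private final Set<String> names = ConcurrentHashMap.newKeySet();
    private volatile boolean loaded;

    KnownTopics(Supplier<Set<String>> loader) {
        this.loader = loader;
    }

    /** Returns true if the topic was listed by the broker or recorded via add(). */
    boolean contains(String topicName) {
        ensureLoaded();
        return names.contains(topicName);
    }

    /** Records a topic that this application has just created. */
    void add(String topicName) {
        names.add(topicName);
    }

    // Double-checked locking: concurrent callers trigger at most one load.
    private void ensureLoaded() {
        if (loaded) {
            return;
        }
        synchronized (this) {
            if (!loaded) {
                names.addAll(loader.get());
                loaded = true;
            }
        }
    }
}
```

The cache never refreshes, so a topic created by another process after the first load will not be seen. In that case the create call would be expected to fail with a "topic already exists" error, which the caller would need to treat as success.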

Otherwise, I personally would use an alternative solution such as Terraform rather than Spring. Topics won't need to be recreated very often (in the same Kafka cluster, at least), so your code might only be run a handful of times, yet you're needlessly increasing the size of your Spring app by dragging that TopicFactory class around.

