
GCP PubSub Java client's Publisher - concurrency not working in a multi-threaded environment

I am trying to consume messages from Kafka and publish them to Google PubSub. We have 4 concurrent Kafka consumer threads, and I injected the Google Pub/Sub client's Publisher into the consumer. But the Publisher is not working concurrently and becomes effectively single-threaded.

But with the same publisher settings, if I run a program outside the Kafka consumer, I am able to publish more than 3000 messages per second.

Here is my Publisher bean injected into the Kafka consumer. I am using the default batchingSettings and setting the number of threads to 10. I played with different numbers, but the Kafka threads seem to force the Publisher to use just one thread.

Publisher bean code:


    ExecutorProvider executorProvider = InstantiatingExecutorProvider.newBuilder()
                .setExecutorThreadCount(10)
                .build();
    
    Publisher publisher = Publisher.newBuilder(TopicName.of(projectId, topicId))
                .setExecutorProvider(executorProvider)
                .build();
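
For reference, here is a minimal sketch of how this can be exposed as a Spring bean so the consumer can inject it; the @Configuration class name, bean method name, and hard-coded IDs are placeholders, and only the builder calls above come from my actual setup:

    @Configuration
    public class PubSubPublisherConfig {

        // Placeholder values; the real app supplies these from configuration.
        private static final String PROJECT_ID = "gcp project Id";
        private static final String TOPIC_ID = "some-topic";

        @Bean
        public Publisher pubSubPublisher() throws IOException {
            // Dedicated executor for the Publisher's batching and callback work.
            ExecutorProvider executorProvider = InstantiatingExecutorProvider.newBuilder()
                    .setExecutorThreadCount(10)
                    .build();

            return Publisher.newBuilder(TopicName.of(PROJECT_ID, TOPIC_ID))
                    .setExecutorProvider(executorProvider)
                    .build();
        }
    }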

My Kafka consumer:

 

  public class MyKafkaListenerForPubSub {

    private final Publisher publisher;

    public MyKafkaListenerForPubSub(Publisher publisher) {
      this.publisher = publisher;
    }

    @KafkaListener(topics = "topicName", containerFactory = "kafkaListenerContainerFactory")
    public void processMessage(String content) throws ExecutionException, InterruptedException {
      ByteString messageData = ByteString.copyFromUtf8(content);
      PubsubMessage pubsubMessage = PubsubMessage.newBuilder().setData(messageData).build();
      ApiFuture<String> messageIdFuture = publisher.publish(pubsubMessage);
    }
  }

I tried with my own executor service, but it behaved the same way.

I am using the Google Pub/Sub library below in a non-Spring-Boot app, but it does use Spring Core.

<dependency>
   <groupId>com.google.cloud</groupId>
   <artifactId>google-cloud-pubsub</artifactId>
   <version>1.116.3</version>
</dependency>

Here is a simple program to replicate the issue. If we use the Publisher inside the executor service, it is not able to push the messages.


import com.google.api.core.ApiFuture;
import com.google.api.gax.core.FixedCredentialsProvider;
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.pubsub.v1.Publisher;
import com.google.protobuf.ByteString;
import com.google.pubsub.v1.PubsubMessage;
import com.google.pubsub.v1.TopicName;

import java.io.ByteArrayInputStream;
import java.util.Base64;
import java.util.Date;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.IntStream;

public class GCPPublishExample {
  private static String b64Val = "GCP CREDS BASE 64 value";
  private static String bigMessage = "Some Big Message of size 30 KB";
  private static String projectId = "gcp project Id";
  private static String topicId = "some-topic";

  private static void addShutdownHook(){
    Runtime.getRuntime().addShutdownHook(new Thread()
    {
      public void run()
      {
        System.out.println("Program Completed at: " + new Date());
      }
    });
  }

  public static void main(String[] args) throws Exception {
    addShutdownHook();
    System.out.println("Program Started at: " + new Date());

    final boolean parallel = true;  // set to false to test without threads
    final int totalRecordsToSend = 100_000; // increase for a bigger run
    final int executorSize = 4; // matches our 4 concurrent Kafka consumers

    System.out.println("parallel: " + parallel + " totalRecordsToSend: " + totalRecordsToSend + " executorSize: " + executorSize);

    GoogleCredentials creds = GoogleCredentials.fromStream(new ByteArrayInputStream(Base64.getDecoder().decode(b64Val)));

    //Using Default batchingSettings and Default executorProvider
    final Publisher publisher = Publisher.newBuilder(TopicName.of(projectId, topicId))
            .setCredentialsProvider(FixedCredentialsProvider.create(creds))
            .build();

    if(parallel) {
      System.out.println("= PUBLISH IN PARALLEL =");
      ExecutorService executorService =  Executors.newFixedThreadPool(executorSize);
      
      IntStream.range(0, totalRecordsToSend)
              .forEach(i -> executorService.execute(new MyGCPPublishTask(publisher, bigMessage)));

      executorService.shutdown();
    }
    else {
      System.out.println("= RUNNING IN SEQUENCE =");
      IntStream.range(0, totalRecordsToSend)
              .forEach(i -> publishMessage(publisher, bigMessage));
    }
  }

  private static void publishMessage(Publisher publisher, String message) {
    try {
      ByteString messageData = ByteString.copyFromUtf8(message);
      PubsubMessage pubsubMessage = PubsubMessage.newBuilder().setData(messageData).build();
      ApiFuture<String> messageIdFuture = publisher.publish(pubsubMessage);
      //System.out.println("Message sent successfully, messageId: {}" + messageIdFuture.get() + " Thread: " + Thread.currentThread().getId());
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

class MyGCPPublishTask implements Runnable {
  private final Publisher publisher;
  private final String message;

  public MyGCPPublishTask(Publisher publisher, String message) {
    this.publisher = publisher;
    this.message = message;
  }

  @Override
  public void run() {
    try {
      ByteString messageData = ByteString.copyFromUtf8(this.message);
      PubsubMessage pubsubMessage = PubsubMessage.newBuilder().setData(messageData).build();
      ApiFuture<String> messageIdFuture = this.publisher.publish(pubsubMessage);
      //System.out.println("Message sent successfully, messageId: " + messageIdFuture.get());
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}
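
A likely reason the parallel branch above appears not to push anything is that nothing waits for the Publisher to flush: publish() only enqueues the message, and once the executor's tasks finish submitting, the JVM can exit while batches are still pending. A sketch of how the end of the parallel branch could wait for everything to drain (the timeout values are arbitrary, and java.util.concurrent.TimeUnit needs to be imported):

// Wait for the submitting threads, then flush and stop the Publisher
// so that batched messages are actually sent before the JVM exits.
executorService.shutdown();
if (!executorService.awaitTermination(5, TimeUnit.MINUTES)) { // arbitrary timeout
  System.out.println("Executor did not finish in time");
}
publisher.shutdown();                            // flushes outstanding batches
publisher.awaitTermination(1, TimeUnit.MINUTES); // arbitrary timeout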

I had a debug statement to log the messageId from GCP, and that caused the performance issue because it waited for GCP to respond. After I commented it out, it worked fine.

LOGGER.debug("Message sent successfully, messageId: {}", messageIdFuture.get());
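
If the message ID still needs to be logged, a non-blocking alternative is to attach a callback to the future instead of calling get() on the consumer thread. A minimal sketch, assuming a LOGGER is available and using Guava's MoreExecutors (which google-cloud-pubsub already pulls in):

// Needs: com.google.api.core.ApiFutures, com.google.api.core.ApiFutureCallback,
//        com.google.common.util.concurrent.MoreExecutors
// Log the message ID asynchronously so the Kafka consumer thread never blocks.
ApiFutures.addCallback(messageIdFuture, new ApiFutureCallback<String>() {
  @Override
  public void onSuccess(String messageId) {
    LOGGER.debug("Message sent successfully, messageId: {}", messageId);
  }

  @Override
  public void onFailure(Throwable t) {
    LOGGER.error("Failed to publish message", t);
  }
}, MoreExecutors.directExecutor());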
