
Pub/Sub pull request count drastically decreases on GCP Kubernetes pods

I have ~5M messages (~7 GB total) in the backlog of my GCP Pub/Sub subscription and want to pull as many of them as possible. I am using synchronous pull with the settings below, waiting 3 minutes for messages to pile up before sending them to another database.

    defaultSettings := &pubsub.ReceiveSettings{
        MaxExtension:           10 * time.Minute,
        MaxOutstandingMessages: 100000,
        MaxOutstandingBytes:    128e6, // 128 MB
        NumGoroutines:          1,
        Synchronous:            true,
    }
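
For context, this is roughly how the settings are wired into a 3-minute synchronous Receive with the Go client (a minimal sketch; the project and subscription IDs are placeholders, and the database hand-off is omitted):

    package main

    import (
        "context"
        "log"
        "time"

        "cloud.google.com/go/pubsub"
    )

    func main() {
        ctx := context.Background()
        client, err := pubsub.NewClient(ctx, "my-project") // placeholder project ID
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        defaultSettings := &pubsub.ReceiveSettings{
            MaxExtension:           10 * time.Minute,
            MaxOutstandingMessages: 100000,
            MaxOutstandingBytes:    128e6, // 128 MB
            NumGoroutines:          1,
            Synchronous:            true,
        }

        sub := client.Subscription("my-subscription") // placeholder subscription ID
        sub.ReceiveSettings = *defaultSettings

        // Pull for 3 minutes, then stop; each message is acked after it has
        // been buffered for the batch write to the other database (omitted).
        cctx, cancel := context.WithTimeout(ctx, 3*time.Minute)
        defer cancel()
        err = sub.Receive(cctx, func(ctx context.Context, m *pubsub.Message) {
            // ... buffer m.Data for the downstream write ...
            m.Ack()
        })
        if err != nil {
            log.Fatal(err)
        }
    }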

The problem is that with around 5 pods in my Kubernetes cluster, each pod is able to pull roughly 90k messages in almost every round (3-minute period). However, when I increase the number of pods to 20, each pod still retrieves ~90k messages in the first or second round, but after a while the pull request count drops drastically and each pod receives only ~1k-5k messages per round. I have investigated the Go library's synchronous pull mechanism and know that without successfully acking messages you cannot request new ones, so the pull request count may drop to avoid exceeding MaxOutstandingMessages. But I am scaling my pods down to zero and starting fresh pods while there are still millions of unacked messages in the subscription, and those fresh pods still receive very few messages in 3 minutes, whether I run 5 or 20 of them.

After around 20-30 minutes they again receive ~90k messages each, and then drop back to very low levels after a while (checking from the metrics page). Another interesting thing is that while my fresh pods receive very few messages, my local computer connected to the same subscription gets ~90k messages in each round.

I have read the quotas and limits page of Pub/Sub; the bandwidth quotas are extremely high (240,000,000 kB per minute, i.e. 4 GB/s, in large regions). I have tried a lot of things but cannot understand why the pull request count drops massively when I start fresh pods. Is there some connection or bandwidth limitation for Kubernetes cluster nodes on GCP, or on the Pub/Sub side? Receiving messages in high volume is critical for my task.

Since you are using synchronous pull, I suggest switching to StreamingPull for Pub/Sub usage at your scale.

Note that to achieve low message delivery latency with synchronous pull, it is important to have many simultaneously outstanding pull requests. As the throughput of the topic increases, more pull requests are necessary. In general, asynchronous pull is preferable for latency-sensitive applications.

For a high-throughput scenario with synchronous pull, it is expected that there should always be many idle pull requests outstanding.

A synchronous pull request establishes a connection to one specific server (process). A high-throughput topic is handled by many servers, but incoming messages go to only a few of them, roughly 3 to 5. Those servers need an idle pull request already connected in order to quickly forward messages.

This conflicts with CPU-based scaling, because idle connections don't generate CPU load. At the very least, there should be many more than 10 pull threads per pod for CPU-based scaling to work.
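
To make that concrete, here is a hedged sketch of the same subscription switched to streaming pull with the Go client. It assumes the imports and *pubsub.Client from the question's setup, and the subscription ID and field values are illustrative, not tuned recommendations. Leaving Synchronous at its default of false makes Receive use StreamingPull, and NumGoroutines roughly corresponds to the number of pull streams each pod keeps open:

    // receiveStreaming assumes the imports and *pubsub.Client from the
    // question's setup; names and values here are illustrative only.
    func receiveStreaming(ctx context.Context, client *pubsub.Client) error {
        sub := client.Subscription("my-subscription") // placeholder subscription ID
        sub.ReceiveSettings = pubsub.ReceiveSettings{
            MaxExtension:           10 * time.Minute,
            MaxOutstandingMessages: 100000,
            MaxOutstandingBytes:    128e6, // 128 MB
            NumGoroutines:          10,    // more streams per pod keeps connections busy
            Synchronous:            false, // default; Receive uses StreamingPull
        }
        return sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
            // handle the message, then ack
            m.Ack()
        })
    }

With streaming pull the client keeps connections open and the service pushes messages onto them, so a pod does not depend on landing its synchronous pull request on one of the few servers that currently hold messages.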

Also, you can use a Horizontal Pod Autoscaler (HPA) configured for the Pub/Sub-consuming GKE pods. With the HPA, you can scale based on CPU usage.

My last recommendation would be to consider Dataflow for your workload, consuming directly from Pub/Sub.
