
Google Pub/Sub ERROR com.google.cloud.pubsub.v1.StreamingSubscriberConnection

I have a Snowplow enricher application hosted in GKE that consumes messages from a Google Pub/Sub subscription, and the application is throwing the error below.

I can see the num_undelivered_messages count spiking (going above 50,000) on the Pub/Sub subscription 3-4 times a day, and I presume these errors occur because the enricher application is unable to fetch messages from the subscription.

Why is the application unable to connect to the Pub/Sub subscription at times?

Any help is really appreciated.

Apr 12, 2022 12:30:32 PM com.google.cloud.pubsub.v1.StreamingSubscriberConnection$2 onFailure
WARNING: failed to send operations
com.google.api.gax.rpc.UnavailableException: io.grpc.StatusRuntimeException: UNAVAILABLE: 502:Bad Gateway
at com.google.api.gax.rpc.ApiExceptionFactory.createException(ApiExceptionFactory.java:69)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:72)
at com.google.api.gax.grpc.GrpcApiExceptionFactory.create(GrpcApiExceptionFactory.java:60)
at com.google.api.gax.grpc.GrpcExceptionCallable$ExceptionTransformingFuture.onFailure(GrpcExceptionCallable.java:97)
at com.google.api.core.ApiFutures$1.onFailure(ApiFutures.java:68)
at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1050)
at com.google.common.util.concurrent.DirectExecutor.execute(DirectExecutor.java:30)
at com.google.common.util.concurrent.AbstractFuture.executeListener(AbstractFuture.java:1176)
at com.google.common.util.concurrent.AbstractFuture.complete(AbstractFuture.java:969)
at com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:760)
at io.grpc.stub.ClientCalls$GrpcFuture.setException(ClientCalls.java:545)
at io.grpc.stub.ClientCalls$UnaryStreamToFuture.onClose(ClientCalls.java:515)
at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:426)
at io.grpc.internal.ClientCallImpl.access$500(ClientCallImpl.java:66)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:689)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$900(ClientCallImpl.java:577)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:751)
at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:740)
at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.grpc.StatusRuntimeException: UNAVAILABLE: 502:Bad Gateway
at io.grpc.Status.asRuntimeException(Status.java:533)
... 15 more

The accumulation of messages in the subscription suggests that your subscribers are not keeping up with the flow of messages.

To monitor your subscribers, you can create a dashboard that contains the backlog metrics num_undelivered_messages and oldest_unacked_message_age (the age of the oldest unacknowledged message in the subscription's backlog), aggregated by resource for all your subscriptions.
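For reference, here is a minimal sketch that reads the num_undelivered_messages metric programmatically with the Cloud Monitoring Java client (google-cloud-monitoring); the project and subscription IDs are placeholders you would replace with your own:

import com.google.cloud.monitoring.v3.MetricServiceClient;
import com.google.monitoring.v3.ListTimeSeriesRequest;
import com.google.monitoring.v3.ProjectName;
import com.google.monitoring.v3.TimeInterval;
import com.google.monitoring.v3.TimeSeries;
import com.google.protobuf.util.Timestamps;

public class BacklogCheck {
  public static void main(String[] args) throws Exception {
    // Placeholder IDs -- substitute your own project and subscription.
    String projectId = "my-project";
    String subscriptionId = "enricher-sub";

    try (MetricServiceClient client = MetricServiceClient.create()) {
      long now = System.currentTimeMillis();
      // Look at the last hour of data points.
      TimeInterval interval = TimeInterval.newBuilder()
          .setStartTime(Timestamps.fromMillis(now - 3_600_000L))
          .setEndTime(Timestamps.fromMillis(now))
          .build();

      ListTimeSeriesRequest request = ListTimeSeriesRequest.newBuilder()
          .setName(ProjectName.of(projectId).toString())
          .setFilter("metric.type=\"pubsub.googleapis.com/subscription/num_undelivered_messages\""
              + " AND resource.labels.subscription_id=\"" + subscriptionId + "\"")
          .setInterval(interval)
          .setView(ListTimeSeriesRequest.TimeSeriesView.FULL)
          .build();

      // Print each sampled backlog size with its timestamp.
      for (TimeSeries ts : client.listTimeSeries(request).iterateAll()) {
        ts.getPointsList().forEach(p ->
            System.out.println(Timestamps.toString(p.getInterval().getEndTime())
                + " backlog=" + p.getValue().getInt64Value()));
      }
    }
  }
}

The same filter works in the Metrics Explorer UI if you prefer building the dashboard by hand.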

  • If both oldest_unacked_message_age and num_undelivered_messages are growing, the subscribers are not keeping up with the message volume.

    Solution: Add more subscriber threads or machines, and look for bugs in your code that might prevent messages from being acknowledged (see the tuning sketch after this list).

  • If there is a steady, small backlog with a steadily growing oldest_unacked_message_age, there may be a small number of messages that cannot be processed: they are redelivered repeatedly but never acknowledged, so they stay stuck in the backlog.

    Solution: Check your application logs to see whether some messages are causing your code to crash. It's unlikely, but possible, that the offending messages are stuck on the Pub/Sub side rather than in your client.

  • If the oldest_unacked_message_age exceeds the subscription's message retention duration, there is a high risk of data loss; the best option is to set up alerts that fire well before the retention duration lapses.
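Regarding the first solution above, here is a minimal sketch of a tuned subscriber using the google-cloud-pubsub Java client. The IDs, thread count, and flow-control limits are illustrative placeholders, not recommendations:

import com.google.api.gax.batching.FlowControlSettings;
import com.google.api.gax.core.InstantiatingExecutorProvider;
import com.google.cloud.pubsub.v1.MessageReceiver;
import com.google.cloud.pubsub.v1.Subscriber;
import com.google.pubsub.v1.ProjectSubscriptionName;

public class TunedSubscriber {
  public static void main(String[] args) {
    // Placeholder IDs -- substitute your own project and subscription.
    ProjectSubscriptionName subscription =
        ProjectSubscriptionName.of("my-project", "enricher-sub");

    MessageReceiver receiver = (message, consumer) -> {
      try {
        // process(message); // your processing logic here
        consumer.ack();
      } catch (Exception e) {
        // Log the offending message so poison messages show up in your logs.
        System.err.println("Failed to process " + message.getMessageId() + ": " + e);
        consumer.nack(); // redelivered later (or routed to a dead-letter topic if configured)
      }
    };

    Subscriber subscriber = Subscriber.newBuilder(subscription, receiver)
        // Open more StreamingPull streams to increase pull throughput.
        .setParallelPullCount(4)
        // More callback threads so slow processing doesn't stall the streams.
        .setExecutorProvider(
            InstantiatingExecutorProvider.newBuilder().setExecutorThreadCount(8).build())
        // Cap outstanding messages so a single instance isn't overwhelmed.
        .setFlowControlSettings(FlowControlSettings.newBuilder()
            .setMaxOutstandingElementCount(1_000L)
            .setMaxOutstandingRequestBytes(100L * 1024 * 1024)
            .build())
        .build();

    subscriber.startAsync().awaitRunning();
    subscriber.awaitTerminated();
  }
}

Raising setParallelPullCount and the executor thread count scales throughput within one instance; adding GKE replicas of the enricher scales it horizontally across machines.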
