
google cloud pubsub node.js client not compatible with google cloud functions

Architecture:

We have an architecture using 2 pubsub topic/subscription pairs:

  • Topic T1 is triggered by a cronjob periodically (every 5 minutes for example). Subscription S1 is the trigger for our cloud function.
  • Topic T2 serves as a queue for background jobs that are published by one of our services. Subscription S2 is read by the cloud function on each execution to service the queued background jobs.

This allows us to control the frequency the background jobs are serviced independent of when they are added to the queue.

The cloud function (triggered by S1) reads messages from S2 by pulling. It decides which background jobs are ready, and upon successfully servicing a job, it ACKs the associated message. Jobs that are not ready, or that failed, are left un-ACKed to be serviced later.
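The servicing loop described above can be sketched as follows. The `subscription` interface here is hypothetical (a `pull()` that resolves to messages carrying an `ack()` method), not the exact client API:

```javascript
// Sketch of the per-execution servicing loop. Hypothetical interface:
// `subscription.pull()` resolves to an array of { job, ack() }.
async function serviceQueue(subscription, isReady, runJob) {
  const messages = await subscription.pull();
  for (const message of messages) {
    if (!isReady(message.job)) continue; // leave un-ACKed for a later run
    try {
      await runJob(message.job);
      message.ack(); // only ACK once the job has been serviced successfully
    } catch (err) {
      // failed jobs stay un-ACKed and will be redelivered
    }
  }
}
```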

Issue:

We have issues using the official node.js pubsub client from Google:

  1. Sometimes ACKed messages re-appear (seemingly indefinitely). By investigating our logs, we verified the messages are ACKed before the ACK deadline and are sure we are calling ack().
  2. Sometimes, after the first execution (after re-deploying the function), subsequent executions never receive new messages. We can verify the messages are queued in subscription S2, either by checking the unacknowledged message count in Stackdriver or by re-deploying the function and seeing the messages get serviced.

We believe this is a problem with Google's node.js pubsub client. The cloud function docs clearly state not to start background activities. However, looking into the node.js pubsub client source, it clearly services acknowledgements in the background using timeouts.

Is Google's node.js pubsub client not compatible with Google Cloud Functions? Google recommends accessing the service APIs only when a client library does not exist or does not meet other needs. Is running the client in a cloud function such an "other need", requiring us to write our own client against the service APIs?

Workaround attempted:

As a "workaround", we tried delaying the end of the cloud function's execution to allow any "background" processes in the node.js pubsub client to complete, but this did not consistently eliminate our issue. It seems the pubsub client is not cloud-function friendly and cannot recover from being stopped between cloud function executions.
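The delay we tried looks roughly like this. The duration is arbitrary; since the client gives no signal that its background flush has finished, any fixed grace period is a guess, which is exactly the problem:

```javascript
// Hypothetical sketch: hold the function open for a fixed grace period
// after ACKing, hoping the client's background flush timers fire in time.
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

async function handler() {
  // ... pull messages from S2, service jobs, call ack() ...
  await delay(2000); // grace period before the runtime freezes the instance
}
```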

Update Feb. 22, 2018

I wrote an article on our blog that describes in detail why we used Pub/Sub in this way and how we are working around the fact that the node.js pubsub client is not compatible with cloud functions.

How are you triggering your functions?

According to the docs, if your function is consuming pubsub messages, then you should use the pubsub trigger. When using the pubsub trigger, the library is not needed. Simply call callback() at the end of your function, and the pubsub message will be properly acknowledged.

For what you intend to do, I don't think your current architecture is the proper option.

I would move your first step to Google App Engine with a cron task, make that task simply move messages from T2 to T1, and leave the function triggered by subscription S1, processing the messages.

So, your jobs would be published on T2, and you'd have a GAE app with a pull subscription S2 triggered by a cron task, and this app would re-publish the messages to T1. Then your function would be triggered by subscription S1 to topic T1, and would run the job in the message, avoiding the extra processing of importing the pubsub library, and using the product as expected.
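The relay step in this architecture can be sketched as below. The `s2` and `t1` interfaces are hypothetical stand-ins (a `pull()` resolving to messages and a `publish()` resolving when sent), not a specific client's API:

```javascript
// Sketch of the cron-triggered relay on GAE. Hypothetical interfaces:
// `s2.pull()` resolves to messages with ack(); `t1.publish(data)` resolves
// once the message has been handed to the topic.
async function relayQueuedJobs(s2, t1) {
  const messages = await s2.pull();
  for (const message of messages) {
    await t1.publish(message.data); // re-publish to the function's topic
    message.ack();                  // safe to ACK once T1 has the copy
  }
  return messages.length;
}
```

The ordering matters: ACKing only after the re-publish resolves means a crash mid-relay redelivers the message rather than losing it.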

Furthermore, I'm not sure how you are originally publishing your jobs to the topic, but Task Queues are a good GAE option (and product-agnostic in Alpha) for rate-limiting tasks.

A GAE app used only for this (with max instances set to 1) would be within the always-free limit, so costs would not noticeably increase.

A developer from the node.js pubsub client confirmed that using the client to pull messages from a Cloud Function is not a supported use case.

The alternative is to use the service APIs. However, the REST APIs have their own caveats when attempting to pull all messages from a subscription.
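For reference, pulling over REST involves two calls: `projects.subscriptions.pull` and `projects.subscriptions.acknowledge`. The helpers below only build the request URL and JSON body (project and subscription names are placeholders). One caveat alluded to above: a single pull may return fewer messages than are available, even fewer than `maxMessages`, so draining a subscription takes repeated calls:

```javascript
const API = 'https://pubsub.googleapis.com/v1';

// POST request shape for projects.subscriptions.pull
function pullRequest(project, subscription, maxMessages) {
  return {
    url: `${API}/projects/${project}/subscriptions/${subscription}:pull`,
    body: { returnImmediately: true, maxMessages },
  };
}

// POST request shape for projects.subscriptions.acknowledge, using the
// ackId values returned in the pull response's receivedMessages
function ackRequest(project, subscription, ackIds) {
  return {
    url: `${API}/projects/${project}/subscriptions/${subscription}:acknowledge`,
    body: { ackIds },
  };
}
```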

I ran into the same problem; I wanted better control over .ack(). Looking at the node.js library from Google, it would be an option to refactor ack() to return a promise, so the function can wait for ack() to complete.

Subscriber.prototype.ack_ = function(message) {
  var breakLease = this.breakLease_.bind(this, message);

  this.histogram.add(Date.now() - message.received);

  if (this.writeToStreams_ && this.isConnected_()) {
    // Streaming path: the returned promise is dropped, so callers
    // cannot wait for the acknowledgement to complete.
    this.acknowledge_(message.ackId, message.connectionId).then(breakLease);
    return;
  }

  // Otherwise the ackId is queued and flushed later on a timeout; this is
  // the background activity that cloud functions disallow.
  this.inventory_.ack.push(message.ackId);
  this.setFlushTimeout_().then(breakLease);
};
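A promise-returning variant could look like the sketch below. It runs against a simplified stand-in for the library's internal Subscriber (`acknowledge_` and `breakLease_` as used above); it is an illustration of the refactor, not a drop-in patch:

```javascript
// Sketch: an ack that resolves once the acknowledgement has actually been
// sent, so a cloud function can `await` it before returning. `subscriber`
// is a simplified stand-in for the library's internal Subscriber.
function ackAndFlush(subscriber, message) {
  return subscriber
    .acknowledge_(message.ackId, message.connectionId)
    .then(() => subscriber.breakLease_(message));
}
```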
