
Get resume token with MongoDB Java driver before first document received in ChangeStream?

This question is similar to "How do I resume a MongoDB ChangeStream at the first document and not just changes after I start listening", but for the Java driver. Afaik this is crucial if one needs to make sure that every document is processed at least once.

For example, let's say that I have a change stream (C) that subscribes to documents and sends an email based on the contents of each document. If sending the email fails, or the server crashes before the email could be sent, the resume token (R) will not have been persisted. When the application starts up again it will "watch" without a resume token, so the document is missed and no email is sent.

Is there a supported way to get the resume token of a ChangeStream before the first change document has been received to mitigate the issue described above?

From what I can tell from the MongoDB specification this must be supported by drivers:

Drivers MUST expose a mechanism to retrieve the same resume token that would be used to automatically resume.

But I cannot seem to find a way to do this using the Java API. Is this possible or is there a recommended workaround?

Note that I would very much prefer not to use startAtOperationTime, which is based on a timestamp: time is fragile, and clocks may be changed on both the server and the client.

In a 4.2-compatible driver implementing the "must expose resume token" provision of the specification, each time the change stream executes a getMore, one of two things happens:

  • Either at least one document is returned, with each document containing a resume token at that document, or
  • No documents are returned, in which case postBatchResumeToken is still provided by 4.0.7+ servers.

As I recall, change streams in the Java driver have a tryNext method; you need to call it to retrieve postBatchResumeToken without blocking the application. The mechanism for retrieving the current resume token (either the one associated with a document or postBatchResumeToken) is driver-specific.

https://mongodb.github.io/mongo-java-driver/4.0/apidocs/mongodb-driver-sync/com/mongodb/client/MongoChangeStreamCursor.html is the closest documentation I can find, except that I believe you would use tryNext instead of next. If tryNext doesn't return a document, you would still read the cursor's current resume token (getResumeToken in that API) to advance your position in the change stream.
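A minimal sketch of that polling loop follows. It assumes a cursor with the two calls mentioned above, tryNext and getResumeToken. So that it runs without a MongoDB deployment, it is written against a hypothetical TokenCursor interface that stands in for MongoChangeStreamCursor, and it uses a String where the driver returns a BsonDocument. TokenStore and poll are likewise names made up for illustration, not driver API.

```java
import java.util.function.Consumer;

final class ResumeTokenPolling {

    /** Hypothetical stand-in for the cursor calls discussed above. */
    interface TokenCursor<T> {
        /** Returns the next change, or null if the last batch was empty. */
        T tryNext();

        /** Token of the last returned change, or the post-batch token if none. */
        String getResumeToken();
    }

    /** Hypothetical persistence hook (database row, file, ...). */
    interface TokenStore {
        void save(String token);
    }

    /**
     * Polls the cursor a fixed number of times and returns how many
     * changes were handled. The token is saved only after the handler
     * returns, so a failing handler leaves the previous token in place
     * and the change is delivered again after a restart.
     */
    static <T> int poll(TokenCursor<T> cursor, Consumer<T> handler,
                        TokenStore store, int polls) {
        int handled = 0;
        for (int i = 0; i < polls; i++) {
            T change = cursor.tryNext();
            if (change != null) {
                handler.accept(change); // e.g. send the email
                handled++;
            }
            // Read the token even when no change arrived: this is what
            // moves the saved position forward during quiet periods.
            String token = cursor.getResumeToken();
            if (token != null) {
                store.save(token);
            }
        }
        return handled;
    }
}
```

With the real driver the cursor would presumably come from collection.watch().cursor(), and the stored value would be the BsonDocument returned by getResumeToken. A production loop would also run until shutdown rather than for a fixed number of polls.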

https://docs.mongodb.com/ruby-driver/master/tutorials/ruby-driver-change-streams/#resuming-a-change-stream may be helpful for resume token tracking in general, although it doesn't cover try_next (which the Ruby driver also implements), which is what you would need here.

This would allow you to correctly resume the change stream even before it has received any documents. You would store the resume token after processing each document, so you need to make progress quickly enough that you don't fall off the oplog. postBatchResumeToken covers the case where there are no changes for a long time, so you don't fall off the oplog then either.

You still need to start the change stream at a timestamp the very first time, when you do not have any resume token yet. https://mongodb.github.io/mongo-java-driver/4.0/apidocs/mongodb-driver-sync/com/mongodb/client/ChangeStreamIterable.html lists startAtOperationTime, which is the method I'd expect you to use. You could potentially pass the current clusterTime as tracked by the driver, if your driver exposes it.
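The startup decision can be sketched as follows: resume from a persisted token when there is one, and fall back to an operation time only on the very first run. ChangeStreamStart, StartPoint, Mode and choose are hypothetical names, and the token is a String and the timestamp a plain number of seconds to keep the sketch self-contained. In the driver the two branches would correspond to resumeAfter and startAtOperationTime on ChangeStreamIterable.

```java
import java.util.Optional;

final class ChangeStreamStart {

    enum Mode { RESUME_AFTER, START_AT_OPERATION_TIME }

    /** Where to open the stream: a mode plus the value that goes with it. */
    static final class StartPoint {
        final Mode mode;
        final String resumeToken;     // set only for RESUME_AFTER
        final long operationTimeSecs; // set only for START_AT_OPERATION_TIME

        private StartPoint(Mode mode, String resumeToken, long operationTimeSecs) {
            this.mode = mode;
            this.resumeToken = resumeToken;
            this.operationTimeSecs = operationTimeSecs;
        }
    }

    /**
     * Prefers the persisted resume token; uses the fallback operation
     * time only when no token has ever been stored.
     */
    static StartPoint choose(Optional<String> persistedToken,
                             long fallbackOperationTimeSecs) {
        return persistedToken
                .map(t -> new StartPoint(Mode.RESUME_AFTER, t, 0L))
                .orElseGet(() -> new StartPoint(
                        Mode.START_AT_OPERATION_TIME, null, fallbackOperationTimeSecs));
    }
}
```

Once the first poll has stored a token, the timestamp branch is never taken again, which limits the exposure to clock changes to that first start.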
