We have been using the MongoDB API with CosmosDB (Server v3.6) quite extensively over the last few months via .NET Core and the latest MongoDB.Driver NuGet package (2.11.0).
Bulk inserts and single inserts work fine, but unfortunately I cannot get bulk operations to work in IsUpsert=true mode.
Note:
- We use Polly to manage rate limiting. As part of this, we handle MongoWriteException, MongoExecutionTimeoutException, MongoCommandException and MongoBulkWriteException.
- This issue can be observed for both sharded/non-sharded collections.
Specifically, given a list of non-sharded input documents List<T> documents, the following works fine:
Bulk Insert:
await Collection.BulkWriteAsync(documents.Select(s => new InsertOneModel<T>(s)),...)
Bulk Update:
await Collection.BulkWriteAsync(documents.Select(s => new ReplaceOneModel<T>(Builders<T>.Filter.Eq("Id", s.Id), s) { IsUpsert = false }),...)
If some of the documents are new, we should be able to use the bulk update code above as is and simply set the IsUpsert flag to true. Unfortunately, this isn't working.
Specifically, given 50 existing documents and 50 new documents:
- ObjectId as primary key: for the first new document it processes, CosmosDb will incorrectly insert it with Id=ObjectId("000000000000000000000000"), and at that point no further documents will be inserted/updated. In this scenario:
  - BulkWriteResult returned with MatchedCount=65, ModifiedCount=65, ProcessedRequests=100, RequestCount=100, Upserts=1, IsAcknowledged=true, IsModifiedCountAvailable=true, InsertedCount=0
- int as primary key then cosmos db seems to
  - BulkWriteResult returns with MatchedCount=50, ModifiedCount=50, ProcessedRequests=100, RequestCount=100, Upserts=8, IsAcknowledged=true, IsModifiedCountAvailable=true, InsertedCount=0

What am I missing? The ObjectId scenario seems totally broken; the other scenario could be coded around, but it doesn't seem correct that no exception is raised here.
For anyone else plagued by this issue: the workarounds were far from straightforward, but here's what I ended up doing.
- ObjectId: This scenario works consistently in both CosmosDB and MongoDB as long as you use either InsertOneModel<> or ReplaceOneModel<>, depending on whether the value of the identifier is ObjectId.Empty or not. However, you may still have to deal with the off-by-one error mentioned below.
- ObjectId: Definitely a bug in CosmosDb, as I couldn't reproduce this scenario in the official MongoDB implementation. To fix this, I had to apply the following two workarounds:
Polly policies to retry the unprocessed requests just like how I normally do with the other MongoDB rate limiting exceptions typically thrown by CosmosDB. Sample code:
BulkWriteResult<T> bulkWriteResult = await Collection.BulkWriteAsync(
    remainingWork,
    new BulkWriteOptions { BypassDocumentValidation = true },
    token);
var actuallyProcessed = bulkWriteResult.DeletedCount
    + bulkWriteResult.InsertedCount
    + bulkWriteResult.ModifiedCount
    + bulkWriteResult.Upserts?.Count;
if (actuallyProcessed < bulkWriteResult.ProcessedRequests.Count)
{
    // Off by one error: OCCASIONALLY, the last one processed is not actually processed
    // No way to detect this, unfortunately - hence the adjustment by 1
    actuallyProcessed = actuallyProcessed > 1 ? actuallyProcessed - 1 : 0;
    var processed = bulkWriteResult.ProcessedRequests.Take((int)actuallyProcessed).ToList().AsReadOnly();
    var unprocessed = bulkWriteResult.ProcessedRequests.Skip((int)actuallyProcessed).ToList().AsReadOnly();
    throw new CosmosDbRateLimitingBugException<T>(unprocessed, processed, bulkWriteResult);
}
MongoDB implementations, but just like above, you also sometimes have to adjust the processed records by 1. Note: This issue applies regardless of using IsUpsert=true. The code below is slightly simplified, as I use Polly.Context to keep track of exceptions and processed/unprocessed records (not shown). Here remainingWork is the list of WriteModel<T> requests that have to be issued to the next BulkWriteAsync<> call.
if (exception is MongoBulkWriteException<T> mostRecentException)
{
    var unProcessedRequests =
        mostRecentException.UnprocessedRequests.ToList();
    if (mostRecentException.WriteErrors.Any())
    {
        // Get the processed request that failed (first write error) and add it back to remainingWork
        var requestWithError = new[]
        {
            mostRecentException.Result.ProcessedRequests[
                mostRecentException.WriteErrors[0].Index]
        };
        unProcessedRequests = unProcessedRequests.Concat(requestWithError).ToList();
    }
    remainingWork = unProcessedRequests.ToList();
}
else if (exception is CosmosDbRateLimitingBugException<T> cosmosDbBug)
{
    remainingWork = cosmosDbBug.UnprocessedRequests;
}
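
For the first ObjectId case above, choosing the write model per document might look like the minimal sketch below. It assumes the MongoDB.Driver NuGet package is referenced. IHasObjectId, SampleDoc and BulkModelBuilder are hypothetical names introduced only for illustration, and the filter on "Id" simply mirrors the one used in the question.

```csharp
using System.Collections.Generic;
using System.Linq;
using MongoDB.Bson;
using MongoDB.Driver;

// Hypothetical interface: any document type exposing an ObjectId key.
public interface IHasObjectId
{
    ObjectId Id { get; set; }
}

// Hypothetical sample document type, used only to exercise the sketch.
public class SampleDoc : IHasObjectId
{
    public ObjectId Id { get; set; }
    public string Name { get; set; }
}

public static class BulkModelBuilder
{
    // New documents (Id == ObjectId.Empty) become inserts;
    // documents that already carry an Id become replacements.
    public static List<WriteModel<T>> Build<T>(IEnumerable<T> documents)
        where T : IHasObjectId
    {
        return documents
            .Select(d => d.Id == ObjectId.Empty
                ? (WriteModel<T>)new InsertOneModel<T>(d)
                : new ReplaceOneModel<T>(Builders<T>.Filter.Eq("Id", d.Id), d))
            .ToList();
    }
}
```

The resulting list can then be passed to Collection.BulkWriteAsync in place of the upsert-based models, for example as the initial remainingWork in the retry code above.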