How to Bulk Insert in Cosmos DB with .NET Core 2.1 and Stream API

I'm trying to implement bulk insert with this CosmosDB sample. The sample targets .NET Core 3.* and relies on System.Text.Json.

When using the CreateItemAsync method, it works perfectly:

    var concurrentTasks = new List<Task<ItemResponse<Notification>>>();
    foreach (var entity in entities)
    {
        entity.Id = GenerateId(entity);

        var requestOptions = new ItemRequestOptions();
        requestOptions.EnableContentResponseOnWrite = false; // We don't need the entire body returned.
        concurrentTasks.Add(Container.CreateItemAsync(entity, new PartitionKey(entity.UserId), requestOptions));
    }

    await Task.WhenAll(concurrentTasks);

However, I'm trying to see if I can reduce the number of RUs by streaming the data directly into CosmosDB, hoping CosmosDB doesn't charge me for deserializing the JSON itself.

I'm working with .NET Core 2.1 and Newtonsoft.Json. This is my code, which does not return a successful status code. The sub-status code in the response header is "0".

    Notification[] notifications = entities.ToArray();
    var itemsToInsert = new Dictionary<PartitionKey, Stream>();

    foreach (var notification in notifications)
    {
        MemoryStream ms = new MemoryStream();
        StreamWriter writer = new StreamWriter(ms);
        JsonTextWriter jsonWriter = new JsonTextWriter(writer);
        JsonSerializer ser = new JsonSerializer();
                
        ser.Serialize(jsonWriter, notification);

        await jsonWriter.FlushAsync();
        await writer.FlushAsync();

        itemsToInsert.Add(new PartitionKey(notification.UserId), ms);
    }

    List<Task> tasks = new List<Task>(notifications.Length);
    foreach (KeyValuePair<PartitionKey, Stream> item in itemsToInsert)
    {
        tasks.Add(Container.CreateItemStreamAsync(item.Value, item.Key)
            .ContinueWith((Task<ResponseMessage> task) =>
            {
                using (ResponseMessage response = task.Result)
                {
                    if (!response.IsSuccessStatusCode)
                    {
                        Console.WriteLine($"Received {response.StatusCode} ({response.ErrorMessage}).");
                    }
                    else
                    {
                    }
                }
            }));
    }

    // Wait until all are done
    await Task.WhenAll(tasks);

response.StatusCode: BadRequest, response.ErrorMessage: null

I assume I'm not serializing into the Stream correctly. Does anyone have a clue?

Update

I discovered that the new System.Text.Json package also targets .NET Standard 2.0, so I installed it from NuGet. Now I can copy the sample code from GitHub mentioned earlier.

        Notification[] notifications = entities.ToArray();
        var itemsToInsert = new List<Tuple<PartitionKey, Stream>>();

        foreach (var notification in notifications)
        {
            notification.id = $"{notification.UserId}:{Guid.NewGuid()}";

            MemoryStream stream = new MemoryStream();
            await JsonSerializer.SerializeAsync(stream, notification);

            itemsToInsert.Add(new Tuple<PartitionKey, Stream>(new PartitionKey(notification.RoleId), stream));
        }

        List<Task> tasks = new List<Task>(notifications.Length);
        foreach (var item in itemsToInsert)
        {
            tasks.Add(Container.CreateItemStreamAsync(item.Item2, item.Item1)
                .ContinueWith((Task<ResponseMessage> task) =>
                {
                    using (ResponseMessage response = task.Result)
                    {
                        if (!response.IsSuccessStatusCode)
                        {
                            Console.WriteLine($"Received {response.StatusCode} ({response.ErrorMessage}).");
                        }
                        else
                        {
                        }
                    }
                }));
        }

        // Wait until all are done
        await Task.WhenAll(tasks);

I double-checked that bulk insert is enabled (otherwise the first method wouldn't work either). I still get a BadRequest, and ErrorMessage is null.

I also verified that, given the BadRequest, the data really isn't added to the container.

I found the problem.

I've set up my Cosmos context with the following options:

    var cosmosSerializationOptions = new CosmosSerializationOptions();
    cosmosSerializationOptions.PropertyNamingPolicy = CosmosPropertyNamingPolicy.CamelCase;

    CosmosClientOptions cosmosClientOptions = new CosmosClientOptions();
    cosmosClientOptions.SerializerOptions = cosmosSerializationOptions;

Hence the camelCase convention. In my first (working) code sample, I let the Cosmos DB client serialize the entity to JSON. It serializes with this camelCase convention, so my partition key property UserId is serialized as userId.

However, to save some RUs I want to use CreateItemStreamAsync, which makes me responsible for the serialization. That is where the mistake was. My property was defined like this:

    public int UserId { get; set; }

So it was serialized to JSON as "UserId": 1.

However, the partition key is defined as /userId. So if I add the JsonPropertyName attribute, it works:

    [JsonPropertyName("userId")]
    public int UserId { get; set; }

...if only an error message would tell me that.
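The naming mismatch described above can be reproduced without touching Cosmos DB. The following is a minimal sketch, assuming System.Text.Json is available (as in the update to the question). The class `PartitionKeyNamingSketch`, its reduced `Notification` type, and the helper `HasTopLevelProperty` are hypothetical names introduced only for illustration. It shows the default casing, a camelCase naming policy as a possible alternative to annotating each property, and a simple pre-flight check that the serialized body contains the partition key property.

```csharp
using System.Text.Json;

public static class PartitionKeyNamingSketch
{
    // Stand-in for the Notification type from the question, reduced to one property.
    public class Notification
    {
        public int UserId { get; set; }
    }

    // Default System.Text.Json behavior: property names are written exactly as declared.
    public static string SerializeDefault(Notification notification)
    {
        return JsonSerializer.Serialize(notification);
    }

    // With a camelCase naming policy, UserId is written as userId,
    // which matches a partition key path of /userId.
    public static string SerializeCamelCase(Notification notification)
    {
        var options = new JsonSerializerOptions
        {
            PropertyNamingPolicy = JsonNamingPolicy.CamelCase
        };
        return JsonSerializer.Serialize(notification, options);
    }

    // Hypothetical pre-flight check: does the JSON body contain the given
    // top-level property? The lookup is case-sensitive.
    public static bool HasTopLevelProperty(string json, string propertyName)
    {
        using (JsonDocument document = JsonDocument.Parse(json))
        {
            return document.RootElement.ValueKind == JsonValueKind.Object
                && document.RootElement.TryGetProperty(propertyName, out _);
        }
    }
}
```

For a `Notification` with `UserId = 1`, `SerializeDefault` produces `{"UserId":1}` and `SerializeCamelCase` produces `{"userId":1}`. The same options object can be passed to `JsonSerializer.SerializeAsync(stream, notification, options)` when writing into a stream.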

Using CreateItemStreamAsync saves about 3% in RUs. That is small per request, but I guess it adds up over time.

It looks like the stream is not readable, hence the bad request. I would make a small modification to how the MemoryStream is created:

    foreach (var notification in notifications)
    {
        itemsToInsert.Add(
            new PartitionKey(notification.UserId),
            new MemoryStream(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(notification))));
    }

Of course, I am using Newtonsoft.Json for JsonConvert.
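As a hedged illustration of what "not readable" could mean here: after writing through a StreamWriter, a MemoryStream's Position sits at the end of the data, so a consumer that reads from the current position sees nothing. Whether the Cosmos SDK rewinds the stream itself is not established in this thread, so treat the sketch below as a defensive pattern and not as a confirmed fix. The class `StreamRewindSketch` and the helper `ToRewoundStream` are hypothetical names.

```csharp
using System.IO;
using System.Text;

public static class StreamRewindSketch
{
    // Hypothetical helper: writes already-serialized JSON text into a MemoryStream
    // and rewinds it so that a consumer starts reading at the first byte.
    public static MemoryStream ToRewoundStream(string json)
    {
        var stream = new MemoryStream();

        // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
        // UTF8Encoding(false) avoids emitting a byte order mark in front of the JSON.
        using (var writer = new StreamWriter(stream, new UTF8Encoding(false), 1024, leaveOpen: true))
        {
            writer.Write(json);
        } // Disposing the writer flushes its buffer into the stream.

        // After writing, Position equals Length; reset it before handing the stream on.
        stream.Position = 0;
        return stream;
    }
}
```

Wrapping a byte array, as in `new MemoryStream(bytes)` above, has the same effect, because a stream constructed that way starts at position 0.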
