I'm using the Copy activity in Azure Data Factory to copy a single 500 MB JSON file from a storage account blob to Cosmos DB, and from Cosmos DB back to a storage account blob.
The AzureBlobStorageLinkedService is configured with a SAS token.
Cosmos DB to storage account blob: 4 minutes
Storage account blob to Cosmos DB: from 2 hours to over 7 hours (timeout)
Before the copy activity starts, an empty collection with 20,000 RU/s is created. I looked at the Cosmos DB metrics and the account is barely busy: there are only a few 429 errors. We use the default indexing configuration and a partition key, so the data is spread over several partition keys in several partition key ranges (partitions).
The JSON file contains 48,000 JSON objects. Some are small and some can be as large as 200 KB.
I tried different WriteBatchSize values:
5: 2 hours
100: 2 hours
10,000: 7 hours (timeout)
I tried with the source and sink in the same region and in different regions => no difference
I tried smaller files => they are much faster (500 KB/s instead of 50 KB/s)
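As a rough cross-check of these rates, here is a minimal sketch of the arithmetic, assuming 1 MB = 1000 KB and that the full 500 MB was transferred in each completed run. The function name `kb_per_second` is just illustrative.

```python
# Rough effective-throughput arithmetic for the completed copy runs.
# Assumption: 1 MB = 1000 KB, and the whole 500 MB file was transferred.

FILE_SIZE_KB = 500 * 1000
DOCUMENT_COUNT = 48_000


def kb_per_second(size_kb: float, minutes: float) -> float:
    """Average transfer rate in KB/s for a run that took `minutes`."""
    return size_kb / (minutes * 60)


# Durations in minutes, taken from the timings reported in this post.
runs = {
    "Cosmos DB -> blob": 4,
    "blob -> Cosmos DB (2 hour run)": 120,
}

for label, minutes in runs.items():
    print(f"{label}: {kb_per_second(FILE_SIZE_KB, minutes):.0f} KB/s")

# Average size of one JSON object in the file.
print(f"average document size: {FILE_SIZE_KB / DOCUMENT_COUNT:.1f} KB")
```

The 2 hour run comes out in the same range as the roughly 50 KB/s I observed, while the export direction is more than an order of magnitude faster.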
Why is it so slow? Is a 500 MB file too large?
I tried very high throughput values and it worked fine:
1,000,000 RU/s: 9 minutes ✔
100,000 RU/s: 15 minutes ✔
But I have to remember to scale down after the data transfer is complete, because of costs!
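One way the scale-down step could be scripted is sketched below, assuming the `azure-cosmos` Python SDK and throughput provisioned on the container itself (not shared at database level). The helper name `scale_down`, the target value, and the database/collection names in the usage comment are hypothetical; `replace_throughput` is the SDK's container method. The service may reject a target below the minimum it allows for the container.

```python
def scale_down(container, target_ru: int):
    """Lower the provisioned throughput of a container after a bulk load.

    `container` is assumed to behave like azure.cosmos.ContainerProxy,
    i.e. to expose replace_throughput(throughput).
    """
    if target_ru <= 0:
        raise ValueError("target_ru must be a positive number of RU/s")
    # Delegates to the SDK; a value the service does not accept raises there.
    return container.replace_throughput(target_ru)


# Hypothetical usage (account URL, key and names are placeholders):
#
# from azure.cosmos import CosmosClient
# client = CosmosClient(account_url, credential=account_key)
# container = (client.get_database_client("mydb")
#                    .get_container_client("mycollection"))
# scale_down(container, 20_000)
```

Something like this could run as a step after the copy activity finishes, so the high RU/s setting is only paid for during the load.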