
Copy activity from a Blob storage account to CosmosDB is very slow

Situation:

I'm using the copy activity in Azure Data Factory to copy one 500 MB JSON file from a storage account blob to CosmosDB, and from CosmosDB back to a storage account blob.

The AzureBlobStorageLinkedService is configured with a SAS token.

Times:

CosmosDB to storage account blob: 4 minutes

Storage account blob to CosmosDB: from 2 hours to over 7 hours (timeout)

CosmosDB:

Before the copy activity starts, an empty collection with 20,000 RU/s is created. I looked at the CosmosDB metrics and the database is mostly idle: there are only a few 429 errors. We use the default indexing configuration and a partition key, so the data is spread over several partition key values in several partition key ranges (partitions).

Data:

The JSON file contains 48,000 JSON objects. Some are small, and some are as large as 200 KB.

Tries:

I tried different WriteBatchSize values:

5: 2 hours

100: 2 hours

10,000: 7 hours (timeout)

I tried with the source and sink in the same region and in different regions => no difference

I tried smaller files => they are much faster (500 KB/s instead of 50 KB/s)
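As a rough sanity check on the numbers above, the observed rates can be turned into transfer times. This is only back-of-the-envelope arithmetic, assuming 1 MB = 1000 KB and a constant rate; the function name `transfer_seconds` is made up for illustration.

```python
def transfer_seconds(size_mb: float, rate_kb_per_s: float) -> float:
    """Seconds needed to move size_mb at a constant rate (1 MB = 1000 KB)."""
    return size_mb * 1000 / rate_kb_per_s


FILE_MB = 500      # size of the JSON file
DOCS = 48_000      # number of JSON objects in the file

slow = transfer_seconds(FILE_MB, 50)    # rate seen with the 500 MB file
fast = transfer_seconds(FILE_MB, 500)   # rate seen with smaller files
avg_doc_kb = FILE_MB * 1000 / DOCS      # average object size in KB

print(f"at 50 KB/s:  {slow / 3600:.1f} h")
print(f"at 500 KB/s: {fast / 60:.1f} min")
print(f"average document size: {avg_doc_kb:.1f} KB")
```

At 50 KB/s the file needs a bit under 3 hours, which is in line with the 2 hour runs, while the rate seen with smaller files would finish it in roughly a quarter of an hour.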

Question:

Why is it so slow? Is a 500 MB file too large?

I tried very high throughput values and it worked fine:

1,000,000 RU/s: 9 minutes ✔
100,000 RU/s: 15 minutes ✔

But I have to remember to scale down after the data transfer is complete, because of costs!
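One way to make sure the scale-down is not forgotten is to wrap the copy in a helper that always lowers the throughput again, even if the copy fails. The following is a minimal sketch, not a tested pipeline: `temporary_throughput` and `FakeContainer` are hypothetical names, and the only SDK call assumed is `replace_throughput` on a container client from the `azure-cosmos` Python package, which applies to containers with their own manually provisioned throughput. The fake container is there so the sketch runs without an Azure account.

```python
from contextlib import contextmanager


@contextmanager
def temporary_throughput(container, high_ru: int, low_ru: int):
    """Raise throughput for the duration of the block, then lower it again."""
    container.replace_throughput(high_ru)
    try:
        yield container
    finally:
        # Runs even if the copy raises, so the high setting is not left on.
        container.replace_throughput(low_ru)


class FakeContainer:
    """Stand-in for a real container client (assumption, for local runs)."""

    def __init__(self):
        self.calls = []

    def replace_throughput(self, throughput):
        # Record the requested RU/s instead of calling Azure.
        self.calls.append(throughput)


container = FakeContainer()
with temporary_throughput(container, high_ru=100_000, low_ru=20_000):
    pass  # trigger the copy activity and wait for it to finish here

print(container.calls)
```

With a real account, `container` would be obtained from `CosmosClient(...).get_database_client(...).get_container_client(...)` instead of `FakeContainer()`. A throughput change may take some time to become effective, so the copy should probably start only after the new value is confirmed.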
