I am confused about how partitioning affects the size limit and throughput value for Azure Cosmos DB (in our case, we are using DocumentDB). If I understand the documentation correctly:
- For a partitioned collection, does the 10 GB storage limit apply to each partition?
- Does the throughput value (e.g. 400 RU/s) apply to each partition, or to the collection?
Taken from the Azure Cosmos DB FAQ:
What is a collection?
A collection is a group of documents and their associated JavaScript application logic. A collection is a billable entity, where the cost is determined by the throughput and used storage. Collections can span one or more partitions or servers and can scale to handle practically unlimited volumes of storage or throughput.
Collections are also the billing entities for Azure Cosmos DB. Each collection is billed hourly, based on the provisioned throughput and used storage space. For more information, see Azure Cosmos DB Pricing.
For more info, see the bold text in the quote below:
Taken from Azure Cosmos DB Pricing:
Provisioned throughput
At any scale, you can store data and provision throughput capacity. Each container is billed hourly based on the amount of data stored (in GBs) and throughput reserved in units of 100 RUs/second, with a minimum of 400 RUs/second. Unlimited containers have a minimum of 100 RUs/second per partition.
Taken from Request Units in Azure Cosmos DB:
When starting a new collection, table or graph, you specify the number of request units per second (RU per second) you want reserved. Based on the provisioned throughput, Azure Cosmos DB allocates physical partitions to host your collection and splits/rebalances data across partitions as it grows.
The other answers here provide a great starting point on throughput provisioning, but they fail to touch on an important point that doesn't get mentioned often in the docs.
Your throughput is actually divided across the number of physical partitions in your collection. So for a multi-partition collection provisioned at 1000 RU/s with 10 physical partitions, each partition effectively gets 100 RU/s. If you have hot partitions that are accessed more frequently, you'll receive throttling errors even though you haven't exceeded the total RU assigned to the collection.
For a single partition collection you obviously get the full RU assigned for that partition since it's the only one.
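The division described above is just arithmetic, and a quick sketch makes the consequence concrete (the partition count here is illustrative; the service decides the actual number of physical partitions):

```python
def per_partition_throughput(total_rus: int, physical_partitions: int) -> float:
    """Provisioned RU/s is split evenly across physical partitions."""
    return total_rus / physical_partitions

# A collection provisioned at 1000 RU/s over 10 physical partitions
# gives each partition only a 100 RU/s budget.
print(per_partition_throughput(1000, 10))  # -> 100.0

# A single-partition collection keeps the full budget.
print(per_partition_throughput(1000, 1))   # -> 1000.0
```

This is why a hot partition can be throttled at a request rate far below the collection's total provisioned throughput.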
If you're using a multi-partition collection, you should strive to pick a partition key with an even access pattern, so that your workload can be distributed evenly across the underlying partitions without bottlenecking.
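To see why the choice of partition key matters, here is a small simulation using an MD5 hash as a hypothetical stand-in for the service's internal hash partitioning (the real function and partition count are internal to Cosmos DB):

```python
import hashlib
from collections import Counter

def partition_for(key: str, partition_count: int = 10) -> int:
    # Hypothetical stand-in for Cosmos DB's internal hash partitioning.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % partition_count

# Even access pattern: many distinct keys (e.g. a userId) spread requests out.
even = Counter(partition_for(f"user-{i}") for i in range(10_000))

# Skewed access pattern: one dominant key funnels every request to one partition.
skewed = Counter(partition_for("tenant-1") for _ in range(10_000))

print(len(even))    # all 10 partitions receive traffic
print(len(skewed))  # 1 -- every request lands on the same physical partition
```

With the skewed key, one partition absorbs the entire workload while the others sit idle, which is exactly the hot-partition throttling scenario described above.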
- For a partitioned collection, does the 10 GB storage limit apply to each partition?

That is correct. Each partition in a partitioned collection can be a maximum of 10 GB in size.
- Does the throughput value (e.g. 400 RU/s) apply to each partition, or to the collection?

The throughput is set at the collection level, not the partition level. Furthermore, the minimum throughput for a partitioned collection is 2500 RU/s, not 400 RU/s; 400 RU/s is the default for a non-partitioned collection.
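The minimums stated in this answer can be expressed as a simple validation rule (these thresholds reflect the answer as written; the service's limits have changed over time, so treat the numbers as illustrative):

```python
def minimum_rus(partitioned: bool) -> int:
    # Minimums as stated in this answer: 2500 RU/s for a partitioned
    # collection, 400 RU/s for a non-partitioned one.
    return 2500 if partitioned else 400

def is_valid_provisioning(rus: int, partitioned: bool) -> bool:
    # Throughput is reserved in units of 100 RU/s at or above the minimum.
    return rus >= minimum_rus(partitioned) and rus % 100 == 0

print(is_valid_provisioning(400, partitioned=False))  # -> True
print(is_valid_provisioning(400, partitioned=True))   # -> False
```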