
Throughput value for Azure Cosmos DB

I am confused about how partitioning affects the size limit and throughput value for Azure Cosmos DB (in our case, we are using the DocumentDB API). If I understand the documentation correctly:

  1. For a partitioned collection, does the 10 GB storage limit apply to each partition?

  2. Does the throughput value (e.g. 400 RU/s) apply to each partition, not the collection?

  1. Whether you use single-partition or multi-partition collections, each partition can be up to 10 GB. This means a single-partition collection cannot exceed that size, whereas a multi-partition collection can.

Taken from the Azure Cosmos DB FAQ:

What is a collection?

A collection is a group of documents and their associated JavaScript application logic. A collection is a billable entity, where the cost is determined by the throughput and used storage. Collections can span one or more partitions or servers and can scale to handle practically unlimited volumes of storage or throughput.
Collections are also the billing entities for Azure Cosmos DB. Each collection is billed hourly, based on the provisioned throughput and used storage space. For more information, see Azure Cosmos DB Pricing.
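The size rule above can be sketched as a quick calculation. This is a minimal illustration, not an SDK call; the 10 GB per-partition cap is the figure quoted in this answer:

```python
# Sketch of the size rule described above: each physical partition caps
# at 10 GB, so a single-partition collection caps at 10 GB total, while
# a multi-partition collection's ceiling grows with its partition count.
GB = 1024 ** 3
PARTITION_LIMIT = 10 * GB

def max_collection_size(partitions: int) -> int:
    """Upper bound on collection size given its partition count."""
    return partitions * PARTITION_LIMIT

print(max_collection_size(1) // GB)   # 10 -- single-partition cap
print(max_collection_size(5) // GB)   # 50 -- grows with partitions
```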

  2. Billing is per collection, where one collection can have one or more partitions. Since Azure allocates partitions to host your collection, the amount of RUs needs to be provisioned per collection. Otherwise, a customer with many partitions would get far more RUs than a customer with an equivalent collection but fewer partitions.

For more info, see the quote below:

Taken from Azure Cosmos DB Pricing:

Provisioned throughput

At any scale, you can store data and provision throughput capacity. Each container is billed hourly based on the amount of data stored (in GBs) and throughput reserved in units of 100 RUs/second, with a minimum of 400 RUs/second. Unlimited containers have a minimum of 100 RUs/second per partition.
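The billing granularity described in that quote can be illustrated with a small calculation. This is a sketch of the rounding rule only; it deliberately omits the storage component and uses no real prices:

```python
# Hedged sketch of the throughput billing granularity quoted above:
# throughput is reserved in units of 100 RU/s, with a 400 RU/s minimum.
RU_UNIT = 100   # billable unit of throughput (RU/s)
MIN_RU = 400    # minimum provisioned throughput per container

def billable_ru_units(provisioned_ru: int) -> int:
    """Round provisioned RU/s up to the nearest billable 100 RU/s unit,
    applying the 400 RU/s floor."""
    ru = max(provisioned_ru, MIN_RU)
    return -(-ru // RU_UNIT)  # ceiling division

print(billable_ru_units(400))   # 4 units of 100 RU/s
print(billable_ru_units(1050))  # 11 -- rounds up to the next unit
```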

Taken from Request Units in Azure Cosmos DB:

When starting a new collection, table or graph, you specify the number of request units per second (RU per second) you want reserved. Based on the provisioned throughput, Azure Cosmos DB allocates physical partitions to host your collection and splits/rebalances data across partitions as it grows.

The other answers here provide a great starting point on throughput provisioning, but they don't touch on an important point that is rarely mentioned in the docs.

Your throughput is actually divided across the number of physical partitions in your collection. So for a multi-partition collection provisioned at 1000 RU/s with 10 physical partitions, each partition actually gets 100 RU/s. If you have hot partitions that are accessed more frequently, you'll receive throttling errors even though you haven't exceeded the total RUs assigned to the collection.
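The division described above can be sketched as follows. This is a simplified model of the behaviour, not the actual Cosmos DB internals:

```python
# Sketch: provisioned RU/s is split evenly across physical partitions,
# so a hot partition can be throttled even when the collection as a
# whole is still under its total budget.
def per_partition_ru(total_ru: int, physical_partitions: int) -> float:
    """Each physical partition's share of the provisioned throughput."""
    return total_ru / physical_partitions

def is_throttled(requested_ru: float, total_ru: int, partitions: int) -> bool:
    """A single partition throttles once its own share is exhausted."""
    return requested_ru > per_partition_ru(total_ru, partitions)

print(per_partition_ru(1000, 10))      # 100.0 RU/s per partition
# 150 RU/s against one hot partition is throttled, even though the
# collection still has 1000 RU/s provisioned in total.
print(is_throttled(150, 1000, 10))     # True
```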

For a single partition collection you obviously get the full RU assigned for that partition since it's the only one.

If you're using a multi-partition collection, you should strive to pick a partition key with an even access pattern so that your workload is distributed evenly across the underlying partitions without bottlenecking.
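The effect of partition-key choice can be illustrated with a small simulation. The key names and the MD5-based hash here are assumptions for the demo, not how Cosmos DB hashes keys internally:

```python
# Illustrative sketch: comparing how evenly two candidate partition keys
# spread documents across a fixed number of physical partitions.
import hashlib
from collections import Counter

def partition_for(key: str, partitions: int = 10) -> int:
    """Map a partition-key value to a partition index (demo hash only)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % partitions

# An even key: many distinct values spread the load across partitions.
even = Counter(partition_for(f"user-{i}") for i in range(1000))
# A skewed key: one dominant value funnels everything to one partition.
skewed = Counter(partition_for("big-tenant") for _ in range(1000))

print(len(even))    # close to 10 partitions receive traffic
print(len(skewed))  # 1 -- a single hot partition
```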

  1. For a partitioned collection, does the 10 GB storage limit apply to each partition?

That is correct. Each partition in a partitioned collection can be a maximum of 10 GB in size.

  2. Does the throughput value (e.g. 400 RU/s) apply to each partition, not the collection?

The throughput is set at the collection level, not the partition level. Furthermore, the minimum RU/s for a partitioned collection is 2500 RU/s, not 400 RU/s; 400 RU/s is the default for a non-partitioned collection.
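The minimums quoted in this answer can be captured in a quick validation helper. These are the historic values stated above; the current limits may differ, so verify against the Azure docs:

```python
# Quick check based on the minimums quoted in the answer above
# (historic values -- verify current limits against the Azure docs).
def min_ru(partitioned: bool) -> int:
    """Minimum provisioned RU/s for a collection of the given type."""
    return 2500 if partitioned else 400

def validate_throughput(ru: int, partitioned: bool) -> bool:
    """True if the requested RU/s meets the collection's minimum."""
    return ru >= min_ru(partitioned)

print(validate_throughput(400, partitioned=False))  # True
print(validate_throughput(400, partitioned=True))   # False: needs >= 2500
```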

