
Partition key for a lookup table in Azure Cosmos DB Tables

I have a very simple lookup table I want to call from an Azure function.

The schema is incredibly simple: Name | Value 1 | Value 2

Name will be unique, but Value 1 and Value 2 will not be. There is no other data in the lookup table.

For an Azure Table you need a partition key and a row key. Obviously the row key would be the Name field.

What exactly should I use for the partition key?

Right now, I'm using a constant because there won't be a ton of data (maybe a couple hundred rows at most), but using a constant seems to go against the point of partitioning.

This answer applies to all Cosmos DB containers, including Tables.

When does it make sense to store your Cosmos DB container in a single partition (use a constant as the partition key)?

  • If you are sure the data size of your container will always remain well under 10GB.
  • If you are sure the throughput requirement for your container will always remain under 10,000 RU/s (RU per second).

If either of the above conditions is false, or if you are not sure about future growth of data size or throughput requirements, then using a partition key based on the guidelines below will allow the container to scale.
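As a rough illustration, the two conditions above can be written as a small check. The function name and the example figures for the lookup table are hypothetical; the 10 GB and 10,000 RU/s thresholds are simply the ones quoted in this answer.

```python
# Thresholds as quoted in this answer; verify them against current service limits.
MAX_SINGLE_PARTITION_GB = 10
MAX_SINGLE_PARTITION_RUS = 10_000


def fits_single_partition(expected_data_gb: float, expected_rus: float) -> bool:
    """Return True only if both projected size and throughput stay under the thresholds."""
    return (
        expected_data_gb < MAX_SINGLE_PARTITION_GB
        and expected_rus < MAX_SINGLE_PARTITION_RUS
    )


# A lookup table with a few hundred small rows (assumed: about 1 MB, 400 RU/s).
print(fits_single_partition(expected_data_gb=0.001, expected_rus=400))  # True
# A container expected to reach 50 GB would need a real partition key.
print(fits_single_partition(expected_data_gb=50, expected_rus=400))     # False
```

For the table in the question, both conditions hold comfortably, so a constant partition key is a reasonable choice.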

How partitioning works in Cosmos DB

Cosmos groups container items into a set of logical partitions based on the partition key. These logical partitions are then mapped to physical partitions. A physical partition is the unit of compute/storage which makes up the underlying database infrastructure.

You can determine how your data is split into logical partitions by your choice of partition key. You have no control over how your logical partitions are mapped to physical partitions; Cosmos handles this automatically and transparently.

Distributing your container across a large number of physical partitions is the way Cosmos allows the container to scale to virtually unlimited size and throughput.

Each logical partition can contain a maximum of 10GB of data. An unpartitioned container can have a maximum throughput of 10,000 RU/s, which implies there is a limit of 10,000 RU/s per logical partition.

The RU/s allocated to your container are evenly split across all physical partitions hosting the container's data. For instance, if your container has 4,000 RU/s allocated and its logical partitions are spread across 4 physical partitions, then each physical partition will have 1,000 RU/s allocated to it. This also means that if one of your physical partitions is under heavy load, or 'hot', it will get rate-limited at 1,000 RU/s, not at 4,000. This is why it is very important to choose a partition key that spreads your data, and access to the data, evenly across partitions.
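The arithmetic in that example can be sketched in a few lines. The function name is hypothetical, and the even split is the behaviour described above, not something queried from the service.

```python
def rus_per_physical_partition(container_rus: int, physical_partitions: int) -> float:
    """Even split of provisioned RU/s across physical partitions, as described above."""
    if physical_partitions < 1:
        raise ValueError("a container has at least one physical partition")
    return container_rus / physical_partitions


# 4,000 RU/s over 4 physical partitions: a hot partition is throttled at 1,000 RU/s.
print(rus_per_physical_partition(4000, 4))  # 1000.0
# The same container on a single physical partition keeps the full allocation.
print(rus_per_physical_partition(4000, 1))  # 4000.0
```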

If your container is in a single logical partition, it will always be mapped to a single physical partition and the entire allocation of RU/s for the container will always be available.

All Cosmos DB transactions are scoped to a single logical partition, and the execution of a stored procedure or trigger is also scoped to a single logical partition.

How to choose a good partition key

Choose a partition key that will evenly distribute your data across logical partitions, which in turn will help ensure the data is evenly mapped across physical partitions. This will prevent 'bottleneck' or 'hot' partitions, which cause rate-limiting and may increase your costs.
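One rough way to compare candidate keys before committing to one is to count how many items would land in each logical partition. The sketch below uses made-up sample data and hypothetical field names (`country`, `customerId`), and it only measures item counts, not item size or access frequency.

```python
from collections import Counter


def largest_partition_share(items: list, key_field: str) -> float:
    """Fraction of items that would fall into the single largest logical partition."""
    if not items:
        raise ValueError("need at least one item")
    counts = Counter(item[key_field] for item in items)
    return max(counts.values()) / len(items)


# Made-up sample: 90% of the items share one country, customers are spread evenly.
orders = [
    {"id": i, "country": "US" if i % 10 else "NZ", "customerId": f"c{i % 50}"}
    for i in range(1000)
]
print(largest_partition_share(orders, "country"))     # 0.9
print(largest_partition_share(orders, "customerId"))  # 0.02
```

A share close to 1.0 suggests a key that would concentrate data in one partition; a small share suggests a more even spread.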

Choose a partition key that will be the filter criteria for a high percentage of your queries. By providing the partition key as a filter to your query, Cosmos can efficiently route your query to the correct partition. If the partition key is not supplied, the result is a 'fan out' query, which is sent to all partitions; this will increase your RU cost and may hinder performance. If you frequently filter based on multiple fields see this article for guidance.
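For the lookup table in the question, a minimal sketch of what this looks like with a constant partition key is shown below. It assumes `table_client` is an `azure.data.tables.TableClient` from the `azure-data-tables` package; the constant `"lookup"`, the property names `Value1`/`Value2` and the helper names are hypothetical.

```python
LOOKUP_PARTITION = "lookup"  # hypothetical constant partition key


def make_entity(name: str, value1: str, value2: str) -> dict:
    """Build a table entity; the unique Name becomes the RowKey."""
    return {
        "PartitionKey": LOOKUP_PARTITION,
        "RowKey": name,
        "Value1": value1,  # assumed property name for "Value 1"
        "Value2": value2,  # assumed property name for "Value 2"
    }


def get_by_name(table_client, name: str) -> dict:
    """Point read: partition key plus row key identify exactly one entity."""
    return table_client.get_entity(partition_key=LOOKUP_PARTITION, row_key=name)


def find_by_value1(table_client, value1: str) -> list:
    """Filter on a non-key property while still supplying the partition key."""
    return list(
        table_client.query_entities(
            query_filter="PartitionKey eq @pk and Value1 eq @v1",
            parameters={"pk": LOOKUP_PARTITION, "v1": value1},
        )
    )
```

In an Azure function the client could be created with `TableClient.from_connection_string(conn_str, table_name=...)` and entities written with `table_client.upsert_entity(make_entity(...))`; the connection string and table name are left to the reader.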

Summary

  • The primary purpose of partitioning your containers in Cosmos DB is to allow the containers to scale in terms of both storage and throughput.
  • Small containers which will not grow significantly in data size or throughput requirements can use a single partition.
  • Large containers, or containers expected to grow in data size or throughput requirements, should be partitioned using a well-chosen partition key.
  • The choice of partition key is critical and may significantly impact your ability to scale, your RU cost and the performance of your queries.

