
Partition key for Azure Cosmos DB collection

I am a bit new to Azure Cosmos DB and am trying to understand its concepts.

I would like help deciding on the best possible partition key for a DocumentDB collection. Please refer to the image below, which shows the possible partitions under different partition keys.

[Image: possible partitions using different partition keys]

As mentioned in the blog post here:

An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.

From the line above, I think that in my case UserId can be used as the partition key.

Can someone please suggest which key is the best candidate for the partition key?

In 10 things to know about DocumentDB Partitioned Collections and in the official documentation you can find a lot of very good advice about choosing a partition key, so I'm not going to repeat it here.

The choice of partition key depends on the data stored in the database and on the filter criteria your most frequent queries use.

It is often advised to partition on something like userid. That works well if your business logic runs many queries for a given userid and each query needs to look up no more than a few hundred entries. In such cases the data can be quickly extracted from a single partition, without the overhead of having to collate data across partitions.
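To make the single-partition point concrete, here is a minimal sketch in plain Python. It is a toy model, not Cosmos DB itself: the class `ToyCollection`, the `NUM_PARTITIONS` value, the `category` property, and the MD5-based placement are all assumptions made up for illustration. It only shows that a query filtering on the partition key can be routed to one partition, while any other filter has to scan all of them.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_for(key_value: str) -> int:
    """Map a partition key value to a partition using a stable hash (toy scheme)."""
    digest = hashlib.md5(key_value.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS


class ToyCollection:
    """In-memory stand-in for a partitioned collection (not the real service)."""

    def __init__(self, partition_key: str):
        self.partition_key = partition_key
        self.partitions = defaultdict(list)

    def insert(self, doc: dict) -> None:
        self.partitions[partition_for(str(doc[self.partition_key]))].append(doc)

    def query(self, **filters):
        """Return (matching docs, number of partitions scanned)."""
        if self.partition_key in filters:
            # The filter includes the partition key: route to one partition.
            targets = [partition_for(str(filters[self.partition_key]))]
        else:
            # No partition key in the filter: fan out to every partition.
            targets = list(range(NUM_PARTITIONS))
        hits = [
            doc
            for p in targets
            for doc in self.partitions[p]
            if all(doc.get(k) == v for k, v in filters.items())
        ]
        return hits, len(targets)


coll = ToyCollection("userId")
for i in range(1000):
    coll.insert({"id": i, "userId": f"user{i % 50}", "category": f"cat{i % 5}"})

docs, scanned = coll.query(userId="user7")
print(len(docs), scanned)  # 20 1
docs, scanned = coll.query(category="cat2")
print(len(docs), scanned)  # 200 4
```

The second query returns correct results too, but it pays for touching every partition, which is the collation overhead mentioned above.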

However, if you have millions of records per user, then partitioning on userid is perhaps the worst option, because the cost of extracting large volumes of data from a single partition will soon exceed the overhead of collating results across partitions. In such cases you want to distribute user data as evenly as possible over all partitions, and you may need to find another property to use as the partition key.
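One rough way to spot this kind of skew before committing to a key is to take a sample of your documents and compare candidate keys by how many distinct values they have and how large the biggest group is. The sketch below does that on made-up data; the `deviceId` property and the 90% heavy user are hypothetical, and the helper names `cardinality` and `largest_share` are mine, not part of any SDK.

```python
from collections import Counter


def cardinality(docs, key):
    """Number of distinct values of `key` in the sample."""
    return len({doc[key] for doc in docs})


def largest_share(docs, key):
    """Fraction of all documents that fall under the most common value of `key`."""
    counts = Counter(doc[key] for doc in docs)
    return max(counts.values()) / len(docs)


# Hypothetical workload: one heavy user owns 90% of the documents.
docs = []
for i in range(10_000):
    user = "heavy-user" if i % 10 != 0 else f"user{i}"
    docs.append({"id": str(i), "userId": user, "deviceId": f"device{i % 500}"})

for key in ("userId", "deviceId"):
    print(key, cardinality(docs, key), round(largest_share(docs, key), 3))
# userId 1001 0.9
# deviceId 500 0.002
```

Here userId has more distinct values, yet 90% of the data lands under a single value, so high cardinality alone does not guarantee an even spread. Whether a property like deviceId is usable still depends on whether your queries can filter on it.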

So, if the data volume is very large, I suggest that you run some simple tests based on your business logic and choose the partition key that performs best. After all, the partition key cannot be changed once it is set.

Hope it helps you.

It depends, but here are a few things to consider:

The blog post you mentioned says:

Additionally, the storage size for documents belonging to the same partition key is limited to 10GB. An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.
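A quick sanity check against that 10GB figure can be sketched as follows. This is only an estimate under stated assumptions: it uses the serialized JSON size of a sample as a proxy for stored size (the real service adds its own overhead), and the function names, the `payload` property, and the `scale` factor are all hypothetical.

```python
import json
from collections import defaultdict

LIMIT_BYTES = 10 * 1024**3  # the 10GB per-partition-key figure quoted above


def bytes_per_key(docs, key):
    """Total serialized JSON size of the sample, grouped by partition key value."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc[key]] += len(json.dumps(doc).encode("utf-8"))
    return dict(totals)


def project(sample_totals, scale):
    """Scale sample totals up to the expected full data volume."""
    return {k: size * scale for k, size in sample_totals.items()}


def keys_over_limit(projected_bytes, limit=LIMIT_BYTES):
    """Partition key values whose projected size exceeds the limit."""
    return sorted(k for k, size in projected_bytes.items() if size > limit)


# Hypothetical sample: 900 documents for one user, 1 each for 100 others,
# each carrying roughly 1 KB of payload.
sample = [{"id": str(i), "userId": "big", "payload": "x" * 1000} for i in range(900)]
sample += [{"id": str(i), "userId": f"user{i}", "payload": "x" * 1000}
           for i in range(900, 1000)]

# Assume the sample represents 1 in 20,000 of the eventual documents.
projected = project(bytes_per_key(sample, "userId"), scale=20_000)
print(keys_over_limit(projected))  # ['big']
```

If any value shows up in that list, the candidate key would run into the per-key storage limit for your data, regardless of how well it fits your queries.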

Also, I really recommend checking this post and video: https://docs.microsoft.com/en-us/azure/cosmos-db/partition-data

The choice of the partition key is an important decision that you have to make at design time. You must pick a property name that has a wide range of values and has even access patterns.

So make sure to choose a partition key that has many distinct values and meets those requirements.
