Azure Cosmos DB partition key

In our current system we mostly need to search on PublisherId and PlanId. The model structure is as follows:

Publisher model: Publisher Id, Publisher Name, …..

Plan model: Plan Id, Plan Name, Publisher Id, …..

The relationship between the Publisher and Plan models is 1:M (one publisher has many plans).

Scenario: We cannot take Publisher Id or Plan Id as the partition key, because we have 3-5 publishers that submit bulk data, and their data might soon exceed the 10 GB limit.

From what is given, Publisher Id does sound like a good candidate for a partition key, but it is not sufficient on its own.

I would suggest combining it with another value to create your partition key and spread the data. One value that might work well is the year. That is, create an id that combines the Publisher Id with the year in which the document was created, e.g. <PublisherId>.2019 (you could include the month as well if you have a very large number of documents per publisher per year).
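A minimal sketch of building such a key in Python. The function name `make_partition_key` is hypothetical, and the `-MM` month suffix is an assumption; the answer only specifies the `<PublisherId>.2019` shape.

```python
from datetime import datetime


def make_partition_key(publisher_id: str, created: datetime,
                       include_month: bool = False) -> str:
    """Combine a publisher id with the document's creation year.

    The optional month suffix format ("-MM") is an assumption made for
    this sketch; pick whatever format suits your data.
    """
    suffix = f"{created.year:04d}"
    if include_month:
        suffix += f"-{created.month:02d}"
    return f"{publisher_id}.{suffix}"


if __name__ == "__main__":
    created = datetime(2019, 3, 14)
    print(make_partition_key("42", created))
    print(make_partition_key("42", created, include_month=True))
```

The value would be stored on each document in whichever property the container uses as its partition key path.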

This makes it quite easy to archive older content over time, and it could also benefit queries, though that depends on your system.

As you note, you will need to look at the spread of your data and pick a partition key that will keep working as you scale.

The 10 GB limit applies to a logical partition, and you should not need to worry about it if you choose a partition key that is broad enough, that is, one with enough distinct values that no single value accumulates that much data.
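One rough way to check how broad a candidate key is would be to total the document sizes per key value on a sample of your data. The sketch below is only an approximation: it measures the serialized JSON size, which is not necessarily what Cosmos DB counts toward the limit, and the helper names are hypothetical.

```python
import json
from collections import defaultdict

# The 10 GB figure quoted in this thread, in bytes.
LOGICAL_PARTITION_LIMIT_BYTES = 10 * 1024**3


def bytes_per_partition(documents, key_field):
    """Sum the approximate JSON size of the documents under each key value."""
    totals = defaultdict(int)
    for doc in documents:
        totals[doc[key_field]] += len(json.dumps(doc).encode("utf-8"))
    return dict(totals)


def largest_share(totals):
    """Return (key value, fraction of all bytes) for the heaviest key value."""
    grand_total = sum(totals.values())
    key = max(totals, key=totals.get)
    return key, totals[key] / grand_total


if __name__ == "__main__":
    sample = [
        {"pk": "1.2019", "planId": "P1"},
        {"pk": "1.2019", "planId": "P2"},
        {"pk": "2.2019", "planId": "P3"},
    ]
    totals = bytes_per_partition(sample, "pk")
    print(totals, largest_share(totals))
```

If one key value holds most of the bytes, or is on track to approach `LOGICAL_PARTITION_LIMIT_BYTES`, the key is probably not broad enough.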

I assumed your document would look something like this, and created a new synthetic partition key, publisherIdentifier:

{
  "publisherIdentifier": "1.Content.USA",
  "publisherId": "1",
  "publisherName": "A",
  "publisherType": "Content",
  "publisherCountry": "USA",
  "plans": [{"planId": "P1"},{"planId": "P2"},{"planId": "P3"}]
}
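Judging from the sample value "1.Content.USA", the synthetic key appears to be publisherId, publisherType and publisherCountry joined with dots. Under that assumption, a small sketch for stamping it onto a document before writing it (the function name is hypothetical):

```python
def with_publisher_identifier(doc: dict) -> dict:
    """Return a copy of doc with the synthetic key added.

    Assumes the key is publisherId.publisherType.publisherCountry,
    as inferred from the sample document.
    """
    parts = (doc["publisherId"], doc["publisherType"], doc["publisherCountry"])
    return {**doc, "publisherIdentifier": ".".join(parts)}


if __name__ == "__main__":
    publisher = {
        "publisherId": "1",
        "publisherName": "A",
        "publisherType": "Content",
        "publisherCountry": "USA",
        "plans": [{"planId": "P1"}, {"planId": "P2"}, {"planId": "P3"}],
    }
    print(with_publisher_identifier(publisher)["publisherIdentifier"])
```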

You can then query the publishers based on their plan:

SELECT VALUE publisher.publisherName
FROM publisher
JOIN plans IN publisher.plans
WHERE plans.planId = "P1"
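As a sketch, the same query could be run from Python with the azure-cosmos SDK, passing the plan id as a parameter instead of embedding it in the query text. The function name is hypothetical, and `container` is assumed to be a container client you have already obtained. Because planId is not the partition key, the query has to be allowed to fan out across partitions.

```python
PUBLISHERS_BY_PLAN = (
    "SELECT VALUE publisher.publisherName "
    "FROM publisher "
    "JOIN plans IN publisher.plans "
    "WHERE plans.planId = @planId"
)


def find_publisher_names(container, plan_id: str) -> list:
    """Return the names of publishers that have a plan with the given id.

    `container` is assumed to be an azure-cosmos container client
    (or anything exposing a compatible query_items method).
    """
    results = container.query_items(
        query=PUBLISHERS_BY_PLAN,
        parameters=[{"name": "@planId", "value": plan_id}],
        # planId is not the partition key, so every partition may be scanned.
        enable_cross_partition_query=True,
    )
    return list(results)
```

Cross-partition queries like this one cost more request units as the container grows, so it is worth checking whether plan lookups are frequent enough to justify a different key or a separate lookup document.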
