DynamoDB Table Design: How do I model a one to many relationship where I need all of the "one" items and one of "many" sorted by some attribute

I've spent my entire career working with denormalized relational databases. I am having a hard time un-learning all of that in order to implement a single-table design that can handle a couple specific access patterns on an "App Store"-like personal project.

Here's a quick ERD. There is an App model identified by a platform (iOS, Android) and bundle identifier, along with a Defaults map that is used when creating new versions. Each App can have 0 to many Versions, which are identified by a version number (a sequential numerical value that is unique within the context of an App). A version has an IsReleased attribute along with several others (like Name, Release Notes, Binary Path, etc).

Access Patterns

  1. List the latest version of every app.
  2. List the latest version of every app for a given platform.
  3. List the latest version of every app where IsReleased is 1.
  4. List the latest version of every app for a given platform where IsReleased is 1.
  5. Get the latest version of a specific app.
  6. Get the latest version of a specific app where IsReleased is 1.
  7. Get all versions of a specific app.
  8. Get all versions of a specific app where IsReleased is 1.
  9. Get the Defaults attribute for a specific app.

I'm having trouble with 1 through 4. This table is where I was headed. I'm having a hard time coming up with a GSI that will give me all of the app items, each with a single version chosen by sort order.

pk                    sk          Defaults          App Name   Version   IsReleased   Other Attributes
app_ios_com.app.one   defaults    {... json... }
app_ios_com.app.one   version_1                     App One    1         1
app_ios_com.app.one   version_2                     App One    2         1
app_ios_com.app.one   version_3                     App One    3         1
app_ios_com.app.two   defaults    {... json... }
app_ios_com.app.two   version_1                     App Two    1         1
app_ios_com.app.two   version_2                     App Two    2         0
app_ios_com.app.two   version_3                     App Two    3         0

For example, for access pattern 1, I want:

pk                    sk          Defaults          App Name   Version   IsReleased   Other Attributes
app_ios_com.app.one   version_3                     App One    3         1
app_ios_com.app.two   version_3                     App Two    3         0

For example, for access pattern 3, I would want:

pk                    sk          Defaults          App Name   Version   IsReleased   Other Attributes
app_ios_com.app.one   version_3                     App One    3         1
app_ios_com.app.two   version_1                     App Two    1         1

Some data constraints that I have to keep in mind:

  • There are currently only 10 to 20 apps, but I need to be able to support hundreds.
  • Most apps will have 100 to 200 versions with 20 to 30 released versions. The biggest app has 1000 versions, of which 50 are released.
  • In the back-end, the IsReleased flag will typically be toggled from 0 to 1, but will occasionally be toggled from 1 to 0.
  • The average version item is approximately 2 KB.
  • The access pattern variations where IsReleased is 1 are more frequently used by a significant margin.

I feel like the solution is right in front of me, but I can't put my finger on it.

TLDR; The solution that springs to mind is a leaderboard pattern to cache the latest app versions in separate record(s). Whenever a new version is added, DynamoDB Streams sends the change as an event to Lambda, which then updates the denormalised Latest records.
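As a rough illustration of the Stream-to-Lambda step, here is a minimal sketch of the bookkeeping such a handler might do. It only works on the event payload and an in-memory dict; the actual write to the Latest records is left out. The function names, the `latest` dict layout, and the assumption that Version and IsReleased are stored as numbers are all mine, not part of the question. It also assumes the stream is configured to include the new image.

```python
from typing import Any, Dict, Optional, Set, Tuple

Image = Dict[str, Dict[str, str]]  # DynamoDB-JSON, e.g. {"pk": {"S": "..."}}

def parse_version_image(image: Image) -> Optional[Dict[str, Any]]:
    """Convert a stream NewImage to a plain dict; None if it is not a version item."""
    sk = image.get("sk", {}).get("S", "")
    if not sk.startswith("version_"):
        return None  # e.g. the "defaults" item
    return {
        "pk": image["pk"]["S"],
        "version": int(image["Version"]["N"]),          # assumed numeric attribute
        "is_released": image["IsReleased"]["N"] == "1",  # assumed numeric attribute
    }

def apply_stream_event(event: Dict[str, Any],
                       latest: Dict[Tuple[str, str], Dict[str, Any]]) -> Set[str]:
    """Update `latest` in place; return apps whose latest released copy must be recomputed."""
    stale: Set[str] = set()
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        item = parse_version_image(record["dynamodb"].get("NewImage", {}))
        if item is None:
            continue
        pk = item["pk"]
        any_key, rel_key = (pk, "any"), (pk, "released")
        if any_key not in latest or item["version"] >= latest[any_key]["version"]:
            latest[any_key] = item
        if item["is_released"]:
            if rel_key not in latest or item["version"] >= latest[rel_key]["version"]:
                latest[rel_key] = item
        elif rel_key in latest and latest[rel_key]["version"] == item["version"]:
            # The cached latest release was un-released (1 -> 0): drop it and
            # flag the app so the caller can re-query the base table.
            del latest[rel_key]
            stale.add(pk)
    return stale
```

The un-release case is the awkward one: the event alone does not say which older version becomes the latest released one, so the sketch only flags the app and leaves the re-query to the caller.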

Note: One piece of information was missing from your excellent write-up: how often do you need to perform the latest queries? If not very often, then "scan-and-done" will be OK for your current volumes. If the answer is 1k latest queries per minute, then it's a different story. The good news is that your basic table design is solid. Latest query optimisation can be implemented incrementally when performance/cost problems arise, without messing with the table design.
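To make "scan-and-done" concrete, here is a small sketch of the reduction you would run over the scanned items in your backend. It takes plain dicts shaped like the rows in the question's table and covers access patterns 1 to 4. The function name is hypothetical, and it assumes the platform is the second `_`-separated token of pk and that Version and IsReleased are numeric.

```python
from typing import Any, Dict, Iterable, List, Optional

def latest_per_app(items: Iterable[Dict[str, Any]],
                   platform: Optional[str] = None,
                   released_only: bool = False) -> List[Dict[str, Any]]:
    """Reduce scanned items to the highest-numbered version per app."""
    latest: Dict[str, Dict[str, Any]] = {}
    for item in items:
        if not item["sk"].startswith("version_"):
            continue  # skip the "defaults" items
        # Assumption: pk looks like "app_<platform>_<bundle id>"
        if platform is not None and item["pk"].split("_", 2)[1] != platform:
            continue
        if released_only and item["IsReleased"] != 1:
            continue
        current = latest.get(item["pk"])
        if current is None or item["Version"] > current["Version"]:
            latest[item["pk"]] = item
    return [latest[pk] for pk in sorted(latest)]

# A subset of the question's sample rows
ITEMS = [
    {"pk": "app_ios_com.app.one", "sk": "defaults"},
    {"pk": "app_ios_com.app.one", "sk": "version_3", "Version": 3, "IsReleased": 1},
    {"pk": "app_ios_com.app.two", "sk": "version_1", "Version": 1, "IsReleased": 1},
    {"pk": "app_ios_com.app.two", "sk": "version_3", "Version": 3, "IsReleased": 0},
]
```

Calling `latest_per_app(ITEMS, released_only=True)` on these rows picks version 3 for app one and version 1 for app two, which is the result wanted for access pattern 3.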

Denormalising the Latest Versions

We will keep one or more denormalised copies of the latest versions, another sinful-sounding DynamoDB pattern. The Stream-triggered Lambda will update those records using the update API when a version is added or changes release status. How to store the latest version info? We have several options:

  1. Store all latest data in a singleton record with map attributes {app1: {latest version copy}, app2: ...}. You can put more logic into the records to handle the isReleased items, or simply fetch the record and filter in your backend.
  2. Use a Global Secondary Index with one record per app. Each record has "latest" as the GSI1PK and app_id as the GSI1SK. The records have the same info as in #1.
  3. Use a GSI with multiple records per app. Something like this seems to work. For instance, query #4 would use GSI1PK=Latest#Released AND begins_with(GSI1SK, "IOS")
GSI1PK              GSI1SK
Latest              app_ios_com.app.one
Latest              IOS#app_ios_com.app.one
Latest#Released     app_ios_com.app.one
Latest#Released     IOS#app_ios_com.app.one
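
For option 3, a sketch of how the key pairs and the query parameters could be built. The dict returned by `latest_query_params` uses the parameter names of the low-level DynamoDB Query API, so it could be passed to a boto3 client's `query` call. The table name `AppStore`, the index name `GSI1`, both function names, and the `app_` prefix used for the "all platforms" case are my assumptions; the prefix is there so the platform-prefixed copies in the same partition are not returned twice.

```python
from typing import Any, Dict, List, Optional, Tuple

def gsi_entries(pk: str, platform: str, is_released: bool) -> List[Tuple[str, str]]:
    """(GSI1PK, GSI1SK) pairs for the 'latest' copies of one app."""
    partitions = ["Latest"] + (["Latest#Released"] if is_released else [])
    sort_keys = [pk, f"{platform.upper()}#{pk}"]
    return [(p, s) for p in partitions for s in sort_keys]

def latest_query_params(platform: Optional[str] = None,
                        released_only: bool = False,
                        table: str = "AppStore",   # hypothetical table name
                        index: str = "GSI1") -> Dict[str, Any]:  # hypothetical index name
    """Query parameters for access patterns 1 to 4."""
    return {
        "TableName": table,
        "IndexName": index,
        "KeyConditionExpression": "GSI1PK = :pk AND begins_with(GSI1SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": {"S": "Latest#Released" if released_only else "Latest"},
            # Assumption: un-prefixed sort keys all start with "app_"
            ":prefix": {"S": f"{platform.upper()}#" if platform else "app_"},
        },
    }
```

For access pattern 4 on iOS, `latest_query_params("ios", released_only=True)` yields the `Latest#Released` partition with the `IOS#` prefix, matching the key condition described in option 3.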


Note: If you have high query volumes and low cardinality, hot partitions may be a problem for these "leaderboard" type denormalised patterns. If this becomes a problem, you can address it by keeping multiple copies of each "latest" record, e.g. have X copies that are queried randomly: latest-copy1, latest-copy2, latest-copy3. Amazon calls this pattern sharding using calculated suffixes.
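
The multiple-copies idea can be sketched in a few lines: writers update every copy, readers pick one at random. The helper names are hypothetical, and the `-copyN` naming simply follows the example keys above.

```python
import random
from typing import List

def shard_keys(base: str, copies: int) -> List[str]:
    """All partition key values a writer must update."""
    return [f"{base}-copy{i}" for i in range(1, copies + 1)]

def pick_shard(base: str, copies: int, rng=random) -> str:
    """One partition key value for a reader, chosen uniformly at random."""
    return f"{base}-copy{rng.randint(1, copies)}"
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible in tests.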
