DynamoDB Table Design: How do I model a one-to-many relationship where I need all of the "one" items and one of the "many", sorted by some attribute?
I've spent my entire career working with denormalized relational databases. I am having a hard time unlearning all of that in order to implement a single-table design that can handle a couple of specific access patterns on an "App Store"-like personal project.
Here's a quick ERD. There is an App model identified by a platform (iOS, Android) and a bundle identifier, along with a Defaults map that is used when creating new versions. Each App can have zero to many Versions, which are identified by a version number (a sequential numerical value that is unique within the context of an App). A Version has an IsReleased attribute along with several others (such as Name, Release Notes, Binary Path, etc.).
Access Patterns
I'm having trouble with 1 through 4. This table is where I was headed. I'm having a hard time coming up with a GSI that will give me all of the app items, each with a single version, by sort order.
pk | sk | Defaults | App Name | Version | IsReleased | Other Attributes
---|---|---|---|---|---|---
app_ios_com.app.one | defaults | {... json... } | | | |
app_ios_com.app.one | version_1 | | App One | 1 | 1 |
app_ios_com.app.one | version_2 | | App One | 2 | 1 |
app_ios_com.app.one | version_3 | | App One | 3 | 1 |
app_ios_com.app.two | defaults | {... json... } | | | |
app_ios_com.app.two | version_1 | | App Two | 1 | 1 |
app_ios_com.app.two | version_2 | | App Two | 2 | 0 |
app_ios_com.app.two | version_3 | | App Two | 3 | 0 |

For example, for access pattern 1, I want:

pk | sk | Defaults | App Name | Version | IsReleased | Other Attributes
---|---|---|---|---|---|---
app_ios_com.app.one | version_3 | | App One | 3 | 1 |
app_ios_com.app.two | version_3 | | App Two | 3 | 0 |

For example, for access pattern 3, I would want:

pk | sk | Defaults | App Name | Version | IsReleased | Other Attributes
---|---|---|---|---|---|---
app_ios_com.app.one | version_3 | | App One | 3 | 1 |
app_ios_com.app.two | version_1 | | App Two | 1 | 1 |

Some data constraints that I have to keep in mind:
I feel like the solution is right in front of me, but I can't put my finger on it.
TL;DR: The solution that springs to mind is a leaderboard pattern to cache the latest app versions in separate record(s). Whenever a new version is added, DynamoDB Streams sends the change as an event to Lambda, which then updates the denormalised Latest records.
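
As a rough illustration of the Stream-to-Lambda step, here is a minimal Python sketch that turns a DynamoDB Streams event into `UpdateItem`-style parameters for the cached records. It assumes the stream is configured to include the new image, and that the `Version` and `IsReleased` attributes from the question are stored as numbers. The `latest` partition key, the choice of the app's `pk` as the sort key, and the function names are hypothetical. The sketch only builds the parameters; in a real handler each dict could be passed to the DynamoDB client's `update_item` call together with a table name.

```python
from typing import Any

LATEST_PK = "latest"  # hypothetical partition key for the cached "latest" records


def parse_version_image(image: dict[str, Any]) -> dict[str, Any]:
    """Convert a DynamoDB-typed NewImage of a version item into plain values."""
    return {
        "pk": image["pk"]["S"],
        "version": int(image["Version"]["N"]),
        "is_released": image["IsReleased"]["N"] == "1",
    }


def build_latest_updates(event: dict[str, Any]) -> list[dict[str, Any]]:
    """Build one conditional update per inserted or modified version item."""
    updates = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # deletions are ignored in this sketch
        image = record.get("dynamodb", {}).get("NewImage", {})
        if not image.get("sk", {}).get("S", "").startswith("version_"):
            continue  # skip the "defaults" items
        v = parse_version_image(image)
        updates.append({
            "Key": {"pk": {"S": LATEST_PK}, "sk": {"S": v["pk"]}},
            "UpdateExpression": "SET #v = :v, #r = :r",
            # Only overwrite the cached copy with the same or a newer version.
            "ConditionExpression": "attribute_not_exists(#v) OR #v <= :v",
            "ExpressionAttributeNames": {"#v": "Version", "#r": "IsReleased"},
            "ExpressionAttributeValues": {
                ":v": {"N": str(v["version"])},
                ":r": {"N": "1" if v["is_released"] else "0"},
            },
        })
    return updates
```

The condition expression is one possible way to keep an out-of-order stream record from replacing a newer cached version with an older one.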
Note: One piece of information was missing from your excellent write-up: how often do you need to perform the `latest` queries? If not very often, then "scan-and-done" will be OK for your current volumes. If the answer is 1k `latest` queries per minute, then it's a different story. The good news is that your basic table design is solid. `Latest` query optimisation can be implemented incrementally when performance/cost problems arise, without messing with the table design.
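
To make the "scan-and-done" option concrete, here is a minimal sketch of the reduction that would run in the backend after reading all items. It assumes the items have already been fetched and converted to plain Python dicts using the attribute names from the question's table; the function name `latest_versions` is hypothetical.

```python
def latest_versions(items, released_only=False):
    """Return {pk: version item} holding the highest Version per app."""
    latest = {}
    for item in items:
        if not item["sk"].startswith("version_"):
            continue  # skip the "defaults" items
        if released_only and item["IsReleased"] != 1:
            continue
        current = latest.get(item["pk"])
        if current is None or item["Version"] > current["Version"]:
            latest[item["pk"]] = item
    return latest


# Sample rows copied from the question's table.
items = [
    {"pk": "app_ios_com.app.one", "sk": "defaults"},
    {"pk": "app_ios_com.app.one", "sk": "version_1", "Version": 1, "IsReleased": 1},
    {"pk": "app_ios_com.app.one", "sk": "version_2", "Version": 2, "IsReleased": 1},
    {"pk": "app_ios_com.app.one", "sk": "version_3", "Version": 3, "IsReleased": 1},
    {"pk": "app_ios_com.app.two", "sk": "defaults"},
    {"pk": "app_ios_com.app.two", "sk": "version_1", "Version": 1, "IsReleased": 1},
    {"pk": "app_ios_com.app.two", "sk": "version_2", "Version": 2, "IsReleased": 0},
    {"pk": "app_ios_com.app.two", "sk": "version_3", "Version": 3, "IsReleased": 0},
]

newest = latest_versions(items)
newest_released = latest_versions(items, released_only=True)
```

With the sample rows, `newest` holds `version_3` for both apps, while `newest_released` holds `version_3` for App One and `version_1` for App Two.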
We will keep denormalised copies of the latest versions, another sinful-sounding DynamoDB pattern. The Stream-triggered Lambda will update those records using the update API when a version is added or changes release status.

How to store the latest version info? We have several options:

1. Store the `latest` data in a singleton record with map attributes `{app1: {latest version copy}, app2: ...}`. You can put more logic into the records to handle the `isReleased` items, or simply fetch the record and filter in your backend.
2. Store the `latest` data in separate records, one per `app_id`. The records have the same info as in #1.

Example key condition: `GSI1PK=Latest#Released AND begins_with(GSI1SK, "IOS")`

GSI1PK | GSI1SK
---|---
Latest | app_ios_com.app.one
Latest | IOS#app_ios_com.app.one
Latest#Released | app_ios_com.app.one
Latest#Released | IOS#app_ios_com.app.one

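
A query against such keys might be assembled as in the sketch below. It only builds the parameter dict for a DynamoDB `Query` request; the index name `GSI1` and the function name are assumptions, and the platform prefix assumes sort keys of the form `IOS#app_ios_com.app.one` as in the table above.

```python
from typing import Any, Optional


def latest_query_params(released_only: bool,
                        platform: Optional[str] = None) -> dict[str, Any]:
    """Build Query parameters for the cached latest-version records."""
    params: dict[str, Any] = {
        "IndexName": "GSI1",  # hypothetical index name
        "KeyConditionExpression": "GSI1PK = :pk",
        "ExpressionAttributeValues": {
            ":pk": {"S": "Latest#Released" if released_only else "Latest"},
        },
    }
    if platform:
        # Narrow to one platform via the sort key prefix, e.g. "IOS".
        params["KeyConditionExpression"] += " AND begins_with(GSI1SK, :prefix)"
        params["ExpressionAttributeValues"][":prefix"] = {"S": platform.upper()}
    return params


ios_released = latest_query_params(released_only=True, platform="ios")
```

Here `ios_released` carries the same condition as the example above, with the literal values moved into `ExpressionAttributeValues`.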
Note: If you have high query volumes and low cardinality, hot partitions may be a problem for these "leaderboard"-type denormalised patterns. If this becomes a problem, you can address it by keeping multiple copies of each "latest" record, e.g. have X copies that are queried randomly: `latest-copy1`, `latest-copy2`, `latest-copy3`. Amazon calls this pattern sharding using calculated suffixes.
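
The multiple-copies idea could look roughly like the following sketch: the writer updates every copy, and each reader picks one copy at random so that reads spread across partitions. The copy count of 3 and the helper names are assumptions for illustration; only the `latest-copyN` naming comes from the note above.

```python
import random

SHARD_COUNT = 3  # assumed number of copies ("X" in the note)


def all_copy_keys(base: str = "latest", shard_count: int = SHARD_COUNT) -> list[str]:
    """Keys the writer must update so that every copy stays in sync."""
    return [f"{base}-copy{i}" for i in range(1, shard_count + 1)]


def pick_read_key(base: str = "latest", shard_count: int = SHARD_COUNT) -> str:
    """Key of one randomly chosen copy for a reader to query."""
    return f"{base}-copy{random.randint(1, shard_count)}"
```

The trade-off is that every change to a "latest" record now costs one write per copy, so the copy count is worth keeping as small as the read load allows.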