DynamoDB one-to-many design thoughts and issue

I am building a web application that uses AWS DynamoDB as its database. To be honest, I am quite new to DynamoDB, and one of the reasons I chose it is that I want to learn it. I am now having trouble designing my tables around a one-to-many relationship.

I have a table with the following attributes.

RegionName (PK) | PostId (RK) | ExpiresAt (LSI) | Message | CreateAt | Topic (LSI) | Status (LSI) |

I put all the data in one table. I have read in many articles about DynamoDB that the first step is always to understand the access patterns before designing the tables. The following are the access patterns for my database.

  • All the regions
  • All the posts from a region
  • All the posts from a region that will expire before a given time
  • All the posts in a region with the status open and topic, "Astronomy"

There are more, of course. But I found the following issues with storing all the data or attributes in one single table.

  • I cannot update the RegionName because it is the partition key.
  • I can't update the Topic, Status, or ExpiresAt attributes because they are local secondary index keys.
  • I cannot add a new region without adding a post with it because I am using a composite primary key. For example, as admin of the website I might add a new region, and some parts of my website will display a list of regions.

So, how can I solve these problems? Is my design correct?

To solve the region problem, as a newcomer to DynamoDB, I tried splitting the data into two tables: a Region table and a Post table, where the Post table has a region_id. This way, I can add a new region separately. But is that the correct way to solve these problems?

I think you're off to a great start. Here's my take on implementing your first three access patterns. Keep in mind that there are many ways to model data in DynamoDB and this is just one of them.

I'm implementing your access patterns using two global secondary indexes. The base table looks like this:

[Image not available in this copy]

The base table will implement your "get posts by region" access pattern.
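As a rough illustration of the base-table pattern, here is a minimal sketch of a post item and the matching query parameters. The `REGION#<region_id>` partition key comes from this answer; the `POST#<post_id>` sort key, the helper names, and the attribute layout are assumptions, since the original table image is not available. The returned dict is shaped so it could be passed to a boto3 `Table.query(**params)` call.

```python
def post_item(region_id, post_id, message, expires_at, status, topic):
    """Build a post item for the base table (assumed key layout)."""
    return {
        "PK": f"REGION#{region_id}",   # partition key format from the answer
        "SK": f"POST#{post_id}",       # assumed sort key format
        "message": message,
        "expires_at": expires_at,
        "status": status,
        "topic": topic,
    }


def posts_in_region_query(region_id):
    """Query parameters for 'all the posts from a region'."""
    return {
        "KeyConditionExpression": "PK = :pk AND begins_with(SK, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": f"REGION#{region_id}",
            ":prefix": "POST#",        # assumed prefix, skips any non-post items
        },
    }
```

The `begins_with` condition is only needed if other item types end up sharing the region partition; with posts alone, `PK = :pk` would be enough.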

For your second access pattern, "fetch posts by expiration date", I created GSI2 using the PK of the base table and an SK of the expiration time. This will let you filter for posts in a region based on the expires_at time:

[Image not available in this copy]
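A query against GSI2 for posts expiring before a cutoff might be sketched as follows. It assumes the index sort key attribute is named `expires_at` and that timestamps are stored as fixed-format UTC ISO 8601 strings, which sort lexicographically in chronological order; neither detail is confirmed by the original images. The helper name is hypothetical.

```python
from datetime import datetime, timezone


def expiring_before_query(region_id, cutoff):
    """Query parameters for posts in a region expiring before `cutoff`.

    `cutoff` must be a timezone-aware datetime.
    """
    # Normalize to UTC and a fixed-width format so string comparison
    # matches chronological order (assumed storage format).
    cutoff_iso = cutoff.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "IndexName": "GSI2",
        "KeyConditionExpression": "PK = :pk AND expires_at < :cutoff",
        "ExpressionAttributeValues": {
            ":pk": f"REGION#{region_id}",
            ":cutoff": cutoff_iso,
        },
    }
```

Storing the expiration as a number (epoch seconds) would work equally well, as long as every item uses the same representation.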

For your third access pattern, "fetch posts within a region by status and topic", I've opted to take a slightly different approach. I created a new field named status_topic which concatenates the status and topic fields. I then defined another GSI using the PK of the main table and an SK of status_topic. That view looks like this:

[Image not available in this copy]

This lets you implement your third access pattern by searching GSI1 where PK = REGION#<region_id> and SK = OPEN#ASTRONOMY. Notice this pattern also lets you search for posts by status: PK = REGION#<region_id> and SK begins_with OPEN#, or PK = REGION#<region_id> and SK begins_with CLOSED#.
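The composite key and the two GSI1 queries could be sketched like this. The `status_topic` attribute, the `GSI1` index name, and the `OPEN#ASTRONOMY` format come from the answer; the helper names, the uppercasing, and the choice of `status_topic` as the index sort key attribute name are assumptions.

```python
def status_topic_key(status, topic):
    """Concatenate status and topic into the composite sort key value."""
    return f"{status.upper()}#{topic.upper()}"


def by_status_and_topic_query(region_id, status, topic):
    """Posts in a region with an exact status and topic."""
    return {
        "IndexName": "GSI1",
        "KeyConditionExpression": "PK = :pk AND status_topic = :st",
        "ExpressionAttributeValues": {
            ":pk": f"REGION#{region_id}",
            ":st": status_topic_key(status, topic),
        },
    }


def by_status_query(region_id, status):
    """Posts in a region with a given status, any topic."""
    return {
        "IndexName": "GSI1",
        "KeyConditionExpression": "PK = :pk AND begins_with(status_topic, :prefix)",
        "ExpressionAttributeValues": {
            ":pk": f"REGION#{region_id}",
            ":prefix": f"{status.upper()}#",
        },
    }


def change_status_update(new_status, topic):
    """Update parameters that keep status and status_topic in sync."""
    return {
        # "#s" aliases the status attribute to avoid reserved-word clashes
        "UpdateExpression": "SET #s = :s, status_topic = :st",
        "ExpressionAttributeNames": {"#s": "status"},
        "ExpressionAttributeValues": {
            ":s": new_status,
            ":st": status_topic_key(new_status, topic),
        },
    }
```

Because status and topic are ordinary attributes of the base table and only act as keys on the GSI, they can be changed with a normal update, provided `status_topic` is rewritten in the same operation. The caller would still need to supply the item's `Key` when passing these parameters to an update call.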

You also mention that you have an access pattern around listing all regions. A simple way to do this would be to create a separate partition that lists all available regions:

[Image not available in this copy]

If you have access patterns around regions (e.g. fetch region by name) or many regions, you might consider creating an item collection to store the regions:

[Image not available in this copy]

Notice I replaced the region names with numerical IDs and moved the name into an attribute. This is to illustrate that you can keep the region name separate from the unique primary key, so renaming a region no longer requires changing a key.
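One possible shape for such a region item collection is sketched below. Everything about the key layout here is an assumption, because the original images are not available: a single `REGIONS` partition, a `REGION#<region_id>` sort key, and a `name` attribute. The helper names are hypothetical.

```python
REGIONS_PK = "REGIONS"  # assumed partition key for the region collection


def region_item(region_id, name):
    """Build a region item that can exist without any posts."""
    return {
        "PK": REGIONS_PK,
        "SK": f"REGION#{region_id}",  # assumed sort key format
        "name": name,
    }


def list_regions_query():
    """Query parameters for 'all the regions'."""
    return {
        "KeyConditionExpression": "PK = :pk",
        "ExpressionAttributeValues": {":pk": REGIONS_PK},
    }


def rename_region_update(region_id, new_name):
    """Update parameters that rename a region without touching its key."""
    return {
        "Key": {"PK": REGIONS_PK, "SK": f"REGION#{region_id}"},
        # "#n" aliases the name attribute to avoid reserved-word clashes
        "UpdateExpression": "SET #n = :name",
        "ExpressionAttributeNames": {"#n": "name"},
        "ExpressionAttributeValues": {":name": new_name},
    }
```

With this layout, an admin can add a region before any post exists, and a rename is a plain attribute update because posts refer to the region by its ID rather than its name.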

Hope this helps get you unstuck!

