DynamoDB GSI data modelling for an articles app

I want to build a serverless articles application (AWS Lambda + DynamoDB, with S3 hosting the front end). I have some questions about the single-table approach. The access patterns I need to support:

  1. Get latest (6) articles sorted by date
  2. Get an article by id
  3. Get the prev/next article relative to the article opened (based on creation date)
  4. Get related articles by tags
  5. Get comments by article

I have created an initial spreadsheet for the information: [image: enter image description here]

The first problem is action 1: I cannot get all the articles ordered by date. I've set the SK for articles to a date, but each article has its own PK value (article-1, article-2, and so on), so I don't know how to fetch all the articles by SK alone.

I then tried creating an LSI, but I noticed that an LSI must have the same PK as the table. So I can select on the LSI where type = 'ARTICLE', but I still cannot get the articles ordered by date (the entities_sort value).

I know AWS says it's good for the PK to be unique, but then how do you group the data in this case?

I've created a GSI: [image: enter image description here]

This lets me get articles by type (GSI2PK = 'ARTICLE') sorted by entities_sort (GSI2SK), but isn't there a better way to achieve this? Can the articles be the PK of the table and somehow still be retrievable sorted by date?

With GSI1PK and GSI1SK set up this way, I can get all the comments for an article using a reverse lookup, so that's good.

But I still don't know how to implement number 3, getting the prev/next article relative to the opened article (based on creation date). The idea would be to get an article by id, check its creation date (entities_sort), and then somehow get the closest article before and after that creation date. Is there a function in DynamoDB that can do this for me?
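One way to approach the prev/next lookup, sketched here under assumptions rather than taken from the post: run two Query calls against the GSI described above, each with a strict comparison on GSI2SK and `Limit` set to 1. The table name `articles` and index name `GSI2` are hypothetical; GSI2PK and GSI2SK are the attribute names from the question. The function only builds the parameter dictionary for the low-level Query API, so it runs without AWS credentials.

```python
def neighbour_query(created_at: str, direction: str) -> dict:
    """Build Query parameters for the article just after ('next')
    or just before ('prev') the given creation date."""
    if direction not in ("next", "prev"):
        raise ValueError("direction must be 'next' or 'prev'")
    newer = direction == "next"
    comparison = ">" if newer else "<"
    return {
        "TableName": "articles",  # hypothetical table name
        "IndexName": "GSI2",      # hypothetical index name
        "KeyConditionExpression": f"GSI2PK = :type AND GSI2SK {comparison} :created",
        "ExpressionAttributeValues": {
            ":type": {"S": "ARTICLE"},
            ":created": {"S": created_at},
        },
        # Ascending order returns the closest newer article first,
        # descending order returns the closest older article first.
        "ScanIndexForward": newer,
        "Limit": 1,
    }


next_params = neighbour_query("2020-05-01T10:00:00Z", "next")
prev_params = neighbour_query("2020-05-01T10:00:00Z", "prev")
```

Each dictionary could then be passed to `boto3.client("dynamodb").query(**params)`. The strict comparison assumes creation dates are unique; two articles with an identical entities_sort value would skip each other.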

In my approach I try to query and process as few items as possible, so I don't want to use filter expressions; I would rather partition my data.

My question is: how should I achieve 1 and 3? And isn't creating two GSIs for so few access patterns overkill?

What is the pattern for keeping articles on a PK that is unique per id, while still being able to get them sorted by creation date?

Thank you

So, here is what I ended up doing.

My access patterns in detail are:

  1. Get any Article by Id (for edit/delete)
  2. Get any Comment by Id (for edit/delete)
  3. Get any Tag by Id (for edit/delete)
  4. Get all Articles ordered by date
  5. Get all the Tags of an Article
  6. Get all comments for an article, sorted by date
  7. Get all Articles that have a specific tag, ordered by date (because I want to show only the last 3 ones)

[image: enter image description here]

This is how I've implemented my model, and I can get all the information I need.

Also, all my data is partitioned and the queries are really efficient: I always get exactly what I need, and the ScannedCount value always equals the number of returned items.

The Global Secondary Index lets me query by Article Id and get all the comments and tags of that Article.

I've solved the many-to-many relation between Tags and Articles with a new record at the end: tag_id, article_date, arct_id, tag_id

So, if I want all articles that have a specific tag sorted by date, I can query the PK of the table and sort by SK. If I want to get a single Tag (for edit/delete), I can use the GSI with article_id and tag_id, which gives me the relation between them.
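The tag lookup described above could be sketched as follows. This is an illustration under assumptions, not the exact query from the post: the table name `articles` and the key attribute names `PK` and `SK` are hypothetical, and the sort key is assumed to hold the article date so that descending order returns the newest articles first.

```python
def latest_articles_for_tag(tag_id: str, limit: int = 3) -> dict:
    """Build Query parameters for the newest articles carrying a tag."""
    return {
        "TableName": "articles",  # hypothetical table name
        "KeyConditionExpression": "#pk = :tag",
        "ExpressionAttributeNames": {"#pk": "PK"},  # assumed key attribute name
        "ExpressionAttributeValues": {":tag": {"S": tag_id}},
        # Results come back ordered by the sort key (assumed to be the
        # article date); False means newest first.
        "ScanIndexForward": False,
        "Limit": limit,
    }


tag_params = latest_articles_for_tag("tag-1")
```

Because `Limit` is applied before any filtering, this reads only the three tag-to-article records it returns, which matches the goal of touching as few items as possible.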

To get all Articles sorted by date, I query PK: ARTICLE, and I can optionally add a condition on the SK if I only want the ones after a given date.
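A minimal sketch of that query with its optional date condition, assuming the key attributes are literally named `PK` and `SK` and the table is called `articles` (both hypothetical, since the model is only shown in the screenshots):

```python
from typing import Optional


def articles_by_date(after: Optional[str] = None, newest_first: bool = True) -> dict:
    """Build Query parameters for all articles, optionally only those
    created after a given date."""
    params = {
        "TableName": "articles",  # hypothetical table name
        "KeyConditionExpression": "#pk = :type",
        "ExpressionAttributeNames": {"#pk": "PK"},  # assumed attribute name
        "ExpressionAttributeValues": {":type": {"S": "ARTICLE"}},
        "ScanIndexForward": not newest_first,
    }
    if after is not None:
        # The date condition is part of the key condition, so it narrows
        # the read itself instead of filtering items afterwards.
        params["KeyConditionExpression"] += " AND #sk > :after"
        params["ExpressionAttributeNames"]["#sk"] = "SK"  # assumed attribute name
        params["ExpressionAttributeValues"][":after"] = {"S": after}
    return params


all_articles = articles_by_date()
recent_articles = articles_by_date(after="2020-06-01")
```

Adding `"Limit": 6` to the returned dictionary would cover the "latest 6 articles" case from the question.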

For all the comments and tags of an Article, I can use the GSI with article_link_pk: article_id, which returns both. If I want only comments, I can use article_link_pk: article_id together with begins_with(article_link_sk, '2020'); this way I get only comments, without tags.
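The two GSI lookups above could be built like this. It is a sketch under assumptions: `article_link_pk` and `article_link_sk` come from the post, while the table name `articles` and index name `article_link_index` are hypothetical.

```python
from typing import Optional


def article_links_query(article_id: str, sk_prefix: Optional[str] = None) -> dict:
    """Build Query parameters for the items linked to an article.
    Without a prefix this returns comments and tags; with one, only
    items whose article_link_sk starts with that prefix."""
    params = {
        "TableName": "articles",            # hypothetical table name
        "IndexName": "article_link_index",  # hypothetical index name
        "KeyConditionExpression": "article_link_pk = :article",
        "ExpressionAttributeValues": {":article": {"S": article_id}},
    }
    if sk_prefix is not None:
        # begins_with is allowed on the sort key in a key condition.
        params["KeyConditionExpression"] += " AND begins_with(article_link_sk, :prefix)"
        params["ExpressionAttributeValues"][":prefix"] = {"S": sk_prefix}
    return params


comments_and_tags = article_links_query("article-1")
comments_only = article_links_query("article-1", sk_prefix="2020")
```

Note that the '2020' prefix matches only sort keys that start with that year, so comments dated in another year would need a different prefix or a second query.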

The data model in NoSQL Workbench looks like this: [image: enter image description here]

The GSI reverse lookup looks like this: [image: enter image description here]

It's been a journey, but I feel like I finally have a grasp of how to do data modelling in DynamoDB.

