DynamoDB data model for urls and associated keywords

Original post: 2021-11-16 15:36:50. Tags: database, amazon-web-services, nosql, amazon-dynamodb

I have items in a DynamoDB table. Each item holds a list of keywords for a URL (the URL is the partition key in my table) from which those words were extracted. Now I want to query the table for one keyword and determine which URL(s) contain that word.

One way is to loop through every item in the table and then loop through its list of keywords to answer the query. Another option is to store each word as the partition key of an item and place the respective URLs against it. But in that case my crawler Lambda will be slowed down.

What do you think, is there another way to achieve the desired result?

1 answer

In contrast to data modeling in relational databases, you design your DynamoDB schemas in such a way that reads are very quick and simple at the cost of more (compute-)expensive writes.

What you've done now is to design your table in a way that writes are cheap and reads are expensive.

In DynamoDB we think in terms of access patterns that your data model is supposed to serve. In your case that would be getUrlsByKeyword. The easiest solution would be to design your table like this:

keyword (Partition Key)	url (Sort Key)
keyword1	https://test.example.com
keyword1	https://test2.example.com
keyword1	https://test3.example.com
wordkey2	https://test.example.com
wordkey2	https://test3.example.com

This lets you run a Query with keyword=<keyword>, which returns all URLs that contain this keyword.
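As a rough sketch of the getUrlsByKeyword pattern, the function below issues a Query against the table and follows pagination. It assumes the attributes are stored as strings named keyword and url, and that `client` behaves like a boto3 low-level DynamoDB client (for example `boto3.client("dynamodb")`); the function name and the table name are hypothetical.

```python
from typing import Any, Dict, List


def get_urls_by_keyword(client: Any, table_name: str, keyword: str) -> List[str]:
    """Return every URL stored under the given keyword partition.

    `client` is assumed to expose the boto3 low-level `query` call.
    """
    params: Dict[str, Any] = {
        "TableName": table_name,
        # Placeholders avoid clashes with DynamoDB reserved words.
        "KeyConditionExpression": "#kw = :kw",
        "ExpressionAttributeNames": {"#kw": "keyword"},
        "ExpressionAttributeValues": {":kw": {"S": keyword}},
    }
    urls: List[str] = []
    while True:
        page = client.query(**params)
        urls.extend(item["url"]["S"] for item in page.get("Items", []))
        # A Query returns at most 1 MB per call; continue from the last key.
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            return urls
        params["ExclusiveStartKey"] = last_key
```

Because the keyword is the partition key, each call reads only the items for that keyword instead of scanning the whole table.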

How would you update this table?

There are two cases you need to worry about, under the assumption that you don't delete URLs from your table:

1) New URL with keywords
2) Existing URL with keywords

Solving 1) is easy: For each new keyword-url combination you add a record to the table above.
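A minimal sketch of case 1), assuming string attributes named keyword and url and a `client` that behaves like a boto3 low-level DynamoDB client. The helper names are hypothetical. Writes are grouped because BatchWriteItem accepts at most 25 requests per call.

```python
from typing import Any, Dict, Iterable, List


def build_put_batches(
    table_name: str, url: str, keywords: Iterable[str], batch_size: int = 25
) -> List[Dict[str, List[Dict[str, Any]]]]:
    """Build BatchWriteItem payloads with one item per keyword-url pair."""
    # Deduplicate: a single batch must not contain the same key twice.
    unique = sorted(set(keywords))
    requests = [
        {"PutRequest": {"Item": {"keyword": {"S": kw}, "url": {"S": url}}}}
        for kw in unique
    ]
    return [
        {table_name: requests[i : i + batch_size]}
        for i in range(0, len(requests), batch_size)
    ]


def write_batches(client: Any, batches: List[Dict[str, Any]]) -> None:
    """Send each batch, resending whatever DynamoDB reports as unprocessed."""
    for request_items in batches:
        pending = request_items
        while pending:
            response = client.batch_write_item(RequestItems=pending)
            # Real code should back off between retries; omitted for brevity.
            pending = response.get("UnprocessedItems") or {}
```

Batching keeps the number of round trips from the crawler low, which addresses part of the concern about slower writes.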

The update case 2) is a bit more annoying, because you need to figure out what already exists in order to change it. As a result, we have a new access pattern, getKeywordsByUrl, which can't easily be served from the table we've defined so far, so we adjust it.

There is an easy trick we can do: we create an inverted index, meaning a Global Secondary Index that switches the partition and sort key of the base table. The GSI would look like this:

Name: GSI1
Partition key: url
Sort key: keyword
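For illustration, the table and its inverted index could be declared roughly as follows. This is a sketch of CreateTable parameters, assuming string keys and on-demand billing; the function name, the billing mode, and the KEYS_ONLY projection are assumptions, not part of the answer.

```python
from typing import Any, Dict


def keyword_table_definition(table_name: str) -> Dict[str, Any]:
    """Return CreateTable parameters for the base table plus GSI1."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
        "AttributeDefinitions": [
            {"AttributeName": "keyword", "AttributeType": "S"},
            {"AttributeName": "url", "AttributeType": "S"},
        ],
        # Base table: keyword is the partition key, url the sort key.
        "KeySchema": [
            {"AttributeName": "keyword", "KeyType": "HASH"},
            {"AttributeName": "url", "KeyType": "RANGE"},
        ],
        # Inverted index: the same two attributes with their roles swapped.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "GSI1",
                "KeySchema": [
                    {"AttributeName": "url", "KeyType": "HASH"},
                    {"AttributeName": "keyword", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
    }
```

With boto3 this dictionary could be passed as `boto3.client("dynamodb").create_table(**keyword_table_definition("UrlKeywords"))`, where the table name is made up for the example.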

If we view GSI1, we see a table like this:

url (GSI1 Partition key)	keyword (GSI1 Sort Key)
https://test.example.com	keyword1
https://test.example.com	wordkey2
https://test2.example.com	keyword1
https://test3.example.com	keyword1
https://test3.example.com	wordkey2

Now we can easily fetch the keywords for a given URL using a Query on GSI1 with url=<url>. Based on its result, you can add new keywords to the base table and delete keywords that no longer exist.
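The update step can be sketched as a set difference between the keywords already stored for a URL (the result of the Query on GSI1) and the keywords from the latest crawl. The function names are hypothetical, and the request shapes assume the BatchWriteItem format with string attributes keyword and url.

```python
from typing import Any, Dict, Iterable, List, Set, Tuple


def plan_keyword_update(
    existing: Iterable[str], crawled: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (keywords to add, keywords to delete) for one URL."""
    existing_set, crawled_set = set(existing), set(crawled)
    to_add = crawled_set - existing_set      # newly found on the page
    to_delete = existing_set - crawled_set   # no longer on the page
    return to_add, to_delete


def build_update_requests(
    url: str, to_add: Set[str], to_delete: Set[str]
) -> List[Dict[str, Any]]:
    """Turn the plan into put and delete requests for the base table."""
    puts = [
        {"PutRequest": {"Item": {"keyword": {"S": kw}, "url": {"S": url}}}}
        for kw in sorted(to_add)
    ]
    deletes = [
        {"DeleteRequest": {"Key": {"keyword": {"S": kw}, "url": {"S": url}}}}
        for kw in sorted(to_delete)
    ]
    return puts + deletes
```

Keywords that are unchanged produce no requests at all, so a re-crawl of a page whose keywords did not change writes nothing.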


solution1 0 2021-11-17 09:07:45
