Best practice for DynamoDB composite primary key travelling inside the system (partition key and sort key)

I am working on a system that stores data in DynamoDB, and the data has to be sorted chronologically. For partition_key I have an id (a UUID) and for sort_key I have a date_created value. Originally the ID alone was enough to save unique entries, but the data was not being sorted the way I wanted, so a sort_key was added.

Using the Python boto3 library, it would be enough for me to get, update or delete items using only the id, since I know that it is always unique:

import boto3

resource = boto3.resource('dynamodb')
table = resource.Table('my_table_name')

table.get_item(
    Key={'item_id': 'unique_item_id'}
)
table.update_item(
    Key={'item_id': 'unique_item_id'}
)
table.delete_item(
    Key={'item_id': 'unique_item_id'}
)

However, DynamoDB requires the sort key to be provided as well, since the primary key is composed of a partition key and a sort key.

table.get_item(
    Key={
        'item_id': 'unique_item_id',
        'date_created': 12345          # timestamp
    }
)

First of all, is it the right approach to use a sort key to sort data chronologically, or are there better approaches?

Secondly, what would be the best approach for passing the partition key and sort key around the system? For example, I have an API endpoint which accepts the ID; the backend uses this ID to perform a get_item call and returns the corresponding data. Since I now also need the sort key, I was thinking about using a hashing algorithm internally, where I would hash a JSON like this:

{
    "item_id": "unique_item_id",
    "date_created": 12345
}

and the single resulting value then becomes my identifier for this database entry. I would then dehash this value before performing any database queries. Is this a common approach?

First of all, is it the right approach to use sort key to sort data chronologically

Sort keys are the means of sorting data in DynamoDB. Using a timestamp as a sort key field is the right thing to do, and a common pattern in DDB.

DynamoDB requires a sort key to be provided ... since primary keys are composed of a partition key and a sort key.

This is true. However, when reading from DDB it is possible to specify only the partition key using the query operation (as opposed to the get_item operation which requires the full primary key). This is a powerful construct that lets you specify which items you want to read from a given partition.
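As a rough illustration of the point above, here is a minimal sketch of a lookup by partition key only. It assumes the table and attribute names from the question (my_table_name, item_id, date_created); the helper names query_by_item_id and get_single_item are hypothetical and not part of boto3. The table argument is expected to be a boto3 Table resource, for example boto3.resource('dynamodb').Table('my_table_name').

```python
def query_by_item_id(table, item_id):
    """Return every item stored under the partition key `item_id`.

    `table` is assumed to be a boto3 DynamoDB Table resource.
    Only the first page of results is read, which is enough when
    each id maps to a single item, as in the question.
    """
    response = table.query(
        # Only the partition key is constrained; the sort key is left open.
        KeyConditionExpression="item_id = :id",
        ExpressionAttributeValues={":id": item_id},
        # Ascending order by sort key (date_created), i.e. oldest first.
        ScanIndexForward=True,
    )
    return response["Items"]


def get_single_item(table, item_id):
    """Return the one item for `item_id`, or None if there is none."""
    items = query_by_item_id(table, item_id)
    return items[0] if items else None
```

With something like this, the API endpoint can keep accepting just the ID. The date_created value comes back as part of the item, so it can then be used as the full key for update_item or delete_item.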

You may want to look into KSUIDs for your unique identifiers. KSUIDs are like UUIDs, but they contain a time component, which allows them to be sorted by generation time. There are several KSUID libraries for Python, so you don't need to implement the algorithm yourself.
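To show why such identifiers sort by generation time, here is a small sketch of the KSUID layout as I understand it: a 4-byte timestamp (seconds since a custom epoch of 1400000000) followed by 16 random bytes, base62-encoded into 27 characters. The function name make_ksuid_like is hypothetical, and this is only meant to illustrate the idea; for real use, one of the maintained libraries is the safer choice.

```python
import os
import time

KSUID_EPOCH = 1_400_000_000  # custom epoch in seconds (assumed from the KSUID format)
BASE62 = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"


def make_ksuid_like(timestamp=None):
    """Build a 27-character, time-sortable identifier (illustrative only)."""
    if timestamp is None:
        timestamp = int(time.time())
    # The timestamp occupies the most significant bytes, so it dominates ordering.
    raw = (timestamp - KSUID_EPOCH).to_bytes(4, "big") + os.urandom(16)
    number = int.from_bytes(raw, "big")
    chars = []
    while number:
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    # Fixed width plus an alphabet in ASCII order keeps string order equal to numeric order.
    return "".join(reversed(chars)).rjust(27, "0")
```

Because the timestamp comes first and the output has a fixed width, an identifier created in a later second compares greater as a plain string than one created earlier. Identifiers created within the same second are ordered only by their random part.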
