
Best practice for DynamoDB composite primary key travelling inside the system (partition key and sort key)

I am working on a system that stores data in DynamoDB, and the data has to be sorted chronologically. For partition_key I have an id (UUID) and for sort_key I have a date_created value. Originally it was enough to save unique entries using only the ID, but the data was not being sorted the way I wanted, so a sort_key was added.

Using the Python boto3 library, it would be enough for me to get, update or delete items using only the id primary key, since I know that it is always unique:

import boto3

resource = boto3.resource('dynamodb')
table = resource.Table('my_table_name')

table.get_item(
    Key={'item_id': 'unique_item_id'}
)
table.update_item(
    Key={'item_id': 'unique_item_id'}
)
table.delete_item(
    Key={'item_id': 'unique_item_id'}
)

However, DynamoDB requires a sort key to be provided as well, since primary keys are composed of a partition key and a sort key.

table.get_item(
    Key={
        'item_id': 'unique_item_id',
        'date_created': 12345          # timestamp
    }
)

First of all, is using a sort key the right approach to sorting data chronologically, or are there better approaches?

Secondly, what would be the best approach for transmitting the partition key and sort key across the system? For example, I have an API endpoint which accepts the ID; the backend uses this ID to perform a get_item query and returns the corresponding data. Now, since I also need the sort key, I was thinking about using a hashing algorithm internally, where I would hash a JSON like this:

{
    "item_id": "unique_item_id",
    "date_created": 12345
}

and a single value then becomes my identifier for this database entry. I would then dehash this value before performing any database queries. Is this approach common?
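A note on terminology: a hash is one-way, so it cannot be "dehashed". What the idea above needs is a reversible encoding. A minimal sketch, assuming URL-safe base64 over the JSON document; the helper names `encode_key` and `decode_key` are hypothetical and not part of boto3:

```python
import base64
import json


def encode_key(item_id: str, date_created: int) -> str:
    """Pack both key attributes into one opaque, URL-safe token."""
    payload = json.dumps(
        {"item_id": item_id, "date_created": date_created},
        separators=(",", ":"),
        sort_keys=True,
    )
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_key(token: str) -> dict:
    """Recover the key dict from a token produced by encode_key."""
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


token = encode_key("unique_item_id", 12345)
key = decode_key(token)  # same shape as the Key argument of get_item
```

This is an encoding, not encryption: anyone holding the token can read both values, so it only hides the key structure from callers as a convenience.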

First of all, is it the right approach to use sort key to sort data chronologically

Sort keys are the means of sorting data in DynamoDB. Using a timestamp as a sort key field is the right thing to do, and a common pattern in DynamoDB.
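For illustration, a table with this key layout could be declared roughly as follows. This is a sketch that assumes the attribute names from the question (`item_id` as a string, `date_created` as a number) and on-demand billing; the helper name `composite_table_definition` is hypothetical. Keep in mind that DynamoDB orders items by sort key only among items that share the same partition key value.

```python
def composite_table_definition(table_name: str) -> dict:
    """Build keyword arguments for create_table with a composite primary key."""
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "item_id", "KeyType": "HASH"},        # partition key
            {"AttributeName": "date_created", "KeyType": "RANGE"},  # sort key
        ],
        "AttributeDefinitions": [
            {"AttributeName": "item_id", "AttributeType": "S"},       # string
            {"AttributeName": "date_created", "AttributeType": "N"},  # number
        ],
        "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
    }


definition = composite_table_definition("my_table_name")
# With AWS credentials configured, the table could then be created with:
#   import boto3
#   boto3.resource("dynamodb").create_table(**definition)
```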

DynamoDB requires a sort key to be provided ... since primary keys are composed of a partition key and a sort key.

This is true. However, when reading from DynamoDB it is possible to specify only the partition key by using the query operation (as opposed to the get_item operation, which requires the full primary key). This is a powerful construct that lets you specify which items you want to read from a given partition.
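As a rough sketch of that query, assuming the table and attribute names from the question: the function below takes any object that behaves like a boto3 Table resource, and the name `get_by_id` is hypothetical. The key condition is written as a plain expression string with a placeholder value.

```python
def get_by_id(table, item_id: str):
    """Return the first item stored under item_id, or None if there is none.

    `table` is expected to behave like a boto3 DynamoDB Table resource.
    """
    response = table.query(
        KeyConditionExpression="item_id = :id",      # partition key only
        ExpressionAttributeValues={":id": item_id},
        ScanIndexForward=True,  # ascending by the sort key (date_created)
        Limit=1,
    )
    items = response.get("Items", [])
    return items[0] if items else None


# Usage, with AWS credentials configured:
#   import boto3
#   table = boto3.resource("dynamodb").Table("my_table_name")
#   item = get_by_id(table, "unique_item_id")
```

Because each id is unique in the question's design, the query returns at most one item, so the API endpoint can keep accepting only the ID. Update and delete operations still need the full key, which can be read from the returned item.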

You may want to look into KSUIDs for your unique identifiers. KSUIDs are like UUIDs, but they contain a time component, which allows them to be sorted by generation time. There are several KSUID libraries for Python, so you don't need to implement the algorithm yourself.
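To show why a time component makes identifiers sortable, here is a simplified sketch using only the standard library. It is not the KSUID format itself and is not taken from any KSUID library; the function name `time_sortable_id` and the hex layout are assumptions made for illustration.

```python
import os
import time


def time_sortable_id(now=None) -> str:
    """Return a hex ID whose lexicographic order follows creation time.

    Layout: 4-byte big-endian Unix timestamp (seconds) + 16 random bytes.
    """
    seconds = int(time.time() if now is None else now)
    return seconds.to_bytes(4, "big").hex() + os.urandom(16).hex()


older = time_sortable_id(now=1_700_000_000)
newer = time_sortable_id(now=1_700_000_001)
# older sorts before newer because the timestamp prefix is compared first
```

IDs generated within the same second share a prefix and are ordered by their random part, so the ordering is only as fine-grained as the timestamp.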
