
DynamoDB Data Modeling - Hierarchical Data Structures as items

The access pattern I'm interested in is fetching the last item for a given exchange and account name.

+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+
|              PK               |                    SK                     | Account |                  Address                   | AddressTag | Exchange | Instrument | Network | Quantity  | TransactionFee |                           TransactionId                            |       TransferDate        |            TransferId            | TransferType |          UpdatedAt           |
+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+
| Exchange#Binance#Account#main | TransferDate#12/17/2022 4:59:12 PM +02:00 | main    | 0xF76d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null       | Binance  | USDT       | ETH     | 97.500139 |            3.2 | 0x46d28f7d0e1e5b1d074a65dcfbb9d90b3bcdc7e6fca6b1f1f7abb5ab219feb24 | 2022-12-17T16:59:12+02:00 | 1b56485f6a3446c3b883f4f485039260 |            0 | 2023-01-28T20:19:59.9181573Z |
| Exchange#Binance#Account#main | TransferDate#12/17/2022 5:38:23 PM +02:00 | main    | 0xF76d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null       | Binance  | USDT       | ETH     | 3107.4889 |            3.2 | 0xbb2b92030b988a0184ba02e2e754b7a7f0f963c496c4e3473509c6fe6b54a41d | 2022-12-17T17:38:23+02:00 | 4747f6ecc74f4dd8a4b565e0f15bcf79 |            0 | 2023-01-28T20:20:00.4536839Z |
| Exchange#FTX#Account#main     | TransferDate#12/17/2021 5:38:23 PM +02:00 | main    | 0x476d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null       | FTX      | USDT       | ETH     |        20 |            3.2 | 0xaa2b92030b988a0184ba02e2e754b7a7f0f963c496c4e3473509c6fe6b54a41d | 2021-12-17T17:38:23+02:00 | 4747f6ecc74f4dd8a4b565e0f15bcf79 |            0 | 2023-01-28T20:20:00.5723855Z |
| Exchange#FTX#Account#main     | TransferDate#12/19/2022 4:59:12 PM +02:00 | main    | 0xc46d3f20bF155681b0b983bFC3ea5fe43A2A6E3c | null       | FTX      | USDT       | ETH     |        15 |            3.2 | 0xddd28f7d0e1e5b1d074a65dcfbb9d90b3bcdc7e6fca6b1f1f7abb5ab219feb24 | 2022-12-19T16:59:12+02:00 | 1b56485f6a3446c3b883f4f485039260 |            0 | 2023-01-28T20:20:00.5207119Z |
+-------------------------------+-------------------------------------------+---------+--------------------------------------------+------------+----------+------------+---------+-----------+----------------+--------------------------------------------------------------------+---------------------------+----------------------------------+--------------+------------------------------+

First of all, it seems to be working as expected, but since I'm still learning, I'm not sure whether the partition key and sort key I chose are good enough. This matters because "uneven distribution of data due to the wrong choice of partition key" can cause reads and writes to exceed the throughput limits.

There is a similar example in the documentation, and this is what it says about using TransactionId as a partition key:

In most cases you won't use TransactionID for any query purposes, so you lose the ability to use the partition key to perform a fast lookup of data. To expand this reasoning, consider the traditional order history view on an e-commerce site. Normally orders are retrieved by customer ID or Order ID, not a UID such as a transaction ID that was synthetically generated during checkout. It's better to choose a natural partition key than generate a synthetic one that won't be used for querying.

Another interesting part of the documentation is about composite sort keys:

Composite sort keys let you define hierarchical (one-to-many) relationships in your data that you can query at any level of the hierarchy.

[country]#[region]#[state]#[county]#[city]#[neighborhood]

This would let you make efficient range queries for a list of locations at any one of these levels of aggregation, from country, to a neighborhood, and everything in between.

I'm also interested in the "Get all user transfers by date range" access pattern, but I'm not sure how to achieve it. So here we are.

C# implementation

public async Task<UserTransferDto?> GetLastAsync(string exchange, string account)
{
    var queryRequest = new QueryRequest
    {
        TableName = TableName,
        KeyConditionExpression = "#pk = :pk",
        ExpressionAttributeNames = new Dictionary<string, string>
        {
            { "#pk", "PK" }
        },
        ExpressionAttributeValues = new Dictionary<string, AttributeValue>
        {
            { ":pk", new AttributeValue { S = $"Exchange#{exchange}#Account#{account}" } }
        },
        ScanIndexForward = false,
        Limit = 1
    };

    var response = await _dynamoDb.QueryAsync(queryRequest);
    if (response.Items.Count == 0)
    {
        return null;
    }

    var itemAsDocument = Document.FromAttributeMap(response.Items[0]);
    return JsonSerializer.Deserialize<UserTransferDto>(itemAsDocument.ToJson());
}

public class UserTransferDto
{
    [JsonPropertyName("PK")]
    public string Pk => $"Exchange#{Exchange}#Account#{Account}";

    [JsonPropertyName("SK")]
    public string Sk => $"TransferDate#{TransferDate}";

    public required string Exchange { get; init; }

    public required string Account { get; init; }

    public required DateTimeOffset TransferDate { get; init; }

    public required string TransferId { get; init; }

    public required TransferType TransferType { get; init; }

    public required string Instrument { get; init; }

    public required string Network { get; init; }

    public required decimal Quantity { get; init; }

    public required string Address { get; init; }

    public string? AddressTag { get; init; }

    public decimal? TransactionFee { get; init; }

    public string? TransactionId { get; init; }

    public DateTime UpdatedAt { get; set; }
}

public enum TransferType
{
    Withdraw = 0,
    Deposit = 1
}

Your base table design works well for getting the latest item for a given exchange and account (a Query with that value as the PK, reading the last item in SK order), except that you're using non-sortable, human-readable timestamps instead of sortable ones. Sort key strings are compared character by character, so a value like "12/17/2022 5:38:23 PM" sorts by month first and ignores AM/PM. You should use a format such as 2023-01-28 12:56:08, with all values in the same time zone, so that the times sort correctly as strings.
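As a minimal sketch of what a sortable sort key could look like, the helper below normalizes the transfer time to UTC and formats it with fixed-width fields. The class name TransferSortKey and the exact format string are assumptions for illustration, not part of the original code.

```csharp
using System;
using System.Globalization;

public static class TransferSortKey
{
    // Hypothetical helper: builds a sort key whose string order matches
    // chronological order (fixed-width fields, UTC, invariant culture).
    public static string From(DateTimeOffset transferDate) =>
        "TransferDate#" + transferDate.UtcDateTime.ToString(
            "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);
}

public static class Program
{
    public static void Main()
    {
        var earlier = new DateTimeOffset(2021, 12, 17, 17, 38, 23, TimeSpan.FromHours(2));
        var later = new DateTimeOffset(2022, 12, 19, 16, 59, 12, TimeSpan.FromHours(2));

        Console.WriteLine(TransferSortKey.From(earlier)); // TransferDate#2021-12-17T15:38:23Z
        Console.WriteLine(TransferSortKey.From(later));   // TransferDate#2022-12-19T14:59:12Z

        // Ordinal comparison mirrors how DynamoDB orders string sort keys
        // (by the bytes of their UTF-8 encoding).
        Console.WriteLine(string.CompareOrdinal(
            TransferSortKey.From(earlier), TransferSortKey.From(later)) < 0); // True
    }
}
```

In the DTO from the question, the Sk property could return TransferSortKey.From(TransferDate) instead of interpolating the DateTimeOffset directly. If two transfers for the same exchange and account can share the same second, the keys would collide and the later write would overwrite the earlier item; appending the TransferId to the sort key is one possible way around that, depending on the data.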

For the other query, finding the latest item across all exchanges and accounts, you can create a GSI that has a single constant PK and the times as the SK. Just beware that you're limited in how many writes per second you can do to the same PK. Above 1,000 write units per second you'll need to shard it, then query each shard to get the latest item per shard, and take the latest of those overall.
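The sharding step can be sketched roughly as follows. This shows only the key arithmetic, not the DynamoDB calls: how a write might be assigned to one of N GSI partition keys, and how the per-shard results are merged into one overall latest value. The shard count, the "Latest#n" key format, the FNV-1a hash, and all names here are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShardedLatest
{
    // Assumed shard count; size it to the expected write rate.
    public const int ShardCount = 4;

    // Hypothetical GSI partition key: spreads writes across shards using a
    // stable FNV-1a hash of the transfer id (string.GetHashCode is not
    // stable across processes, so it is avoided here).
    public static string GsiPk(string transferId)
    {
        uint hash = 2166136261u;
        foreach (char c in transferId)
        {
            hash = (hash ^ (uint)c) * 16777619u;
        }
        return $"Latest#{hash % (uint)ShardCount}";
    }

    // Given the newest sort key returned by each shard's query (null for an
    // empty shard), picks the newest overall. Assumes sortable sort keys.
    public static string? LatestOverall(IEnumerable<string?> perShardLatest) =>
        perShardLatest
            .Where(sk => sk is not null)
            .OrderBy(sk => sk, StringComparer.Ordinal)
            .LastOrDefault();
}

public static class Program
{
    public static void Main()
    {
        // One of Latest#0 .. Latest#3, always the same for the same id.
        Console.WriteLine(ShardedLatest.GsiPk("1b56485f6a3446c3b883f4f485039260"));

        var newest = ShardedLatest.LatestOverall(new string?[]
        {
            "TransferDate#2022-12-17T15:38:23Z",
            null,
            "TransferDate#2022-12-19T14:59:12Z",
            "TransferDate#2021-12-17T15:38:23Z",
        });
        Console.WriteLine(newest); // TransferDate#2022-12-19T14:59:12Z
    }
}
```

Each shard would be read with the same kind of Query as GetLastAsync in the question (ScanIndexForward = false, Limit = 1), using the shard's key as the GSI partition key, and the results passed to LatestOverall.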

This is a pattern described in https://youtu.be/0iGR8GnIItQ
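For the "Get all user transfers by date range" pattern from the question, sortable sort keys also make a BETWEEN condition on the SK possible. The sketch below builds such a request with the same QueryRequest type used in the question. It assumes the sort keys are stored as "TransferDate#" followed by a UTC timestamp in yyyy-MM-ddTHH:mm:ssZ form; the class and method names are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using Amazon.DynamoDBv2.Model;

public static class TransferQueries
{
    // Assumed sort key format: fixed-width UTC timestamp, sortable as a string.
    private static string Sk(DateTimeOffset t) =>
        "TransferDate#" + t.UtcDateTime.ToString(
            "yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture);

    // Builds a Query for one exchange/account partition, limited to the
    // inclusive range [start, end] on the sort key.
    public static QueryRequest ByDateRange(
        string tableName, string exchange, string account,
        DateTimeOffset start, DateTimeOffset end)
    {
        return new QueryRequest
        {
            TableName = tableName,
            KeyConditionExpression = "#pk = :pk AND #sk BETWEEN :start AND :end",
            ExpressionAttributeNames = new Dictionary<string, string>
            {
                { "#pk", "PK" },
                { "#sk", "SK" }
            },
            ExpressionAttributeValues = new Dictionary<string, AttributeValue>
            {
                { ":pk", new AttributeValue { S = $"Exchange#{exchange}#Account#{account}" } },
                { ":start", new AttributeValue { S = Sk(start) } },
                { ":end", new AttributeValue { S = Sk(end) } }
            }
        };
    }
}
```

The request would be passed to _dynamoDb.QueryAsync as in GetLastAsync. A Query targets a single partition key, so this covers one exchange and account at a time; covering several of them means one query per partition or a separate index. Results can also be paginated, in which case the query is repeated with ExclusiveStartKey set to the previous response's LastEvaluatedKey.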
