
How to store large arrays in DynamoDB

I am new to DynamoDB and I am curious what the best way is to store potentially large arrays.

I have a user object that looks like this:

UserId: String
Watching: Card[]
Listings: Card[]

I am aware there is a limit to the size of objects in DynamoDB; I think it is 1 MB? So if a user had many listings, the object might exceed this limit. What is the best practice for storing potentially large arrays like this? Would it be to store an array of CardIds and then run a second query to fetch the cards?

The size limit of an item in DynamoDB is 400 KB; see DynamoDB Quotas.

For larger attribute values, AWS suggests compressing the attribute with a format such as GZIP and storing it as a binary attribute in DynamoDB. Another option is to store the item as JSON in S3 and store the S3 object key in DynamoDB.

See: Best Practices for Storing Large Items and Attributes
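As a rough illustration of the compression option above, the following sketch serializes a list of cards to JSON and gzips it using only the Python standard library. The card fields (cardId, title, price) and the helper names are assumptions made up for the example, since the question does not show the Card shape; the resulting bytes would be stored in a Binary attribute.

```python
import gzip
import json

# DynamoDB's per-item size limit (400 KB), as stated in the answer above.
MAX_ITEM_BYTES = 400 * 1024


def compress_cards(cards):
    """Serialize a list of card dicts to compact JSON and gzip it into bytes."""
    raw = json.dumps(cards, separators=(",", ":")).encode("utf-8")
    return gzip.compress(raw)


def decompress_cards(blob):
    """Reverse compress_cards: gunzip the bytes and parse the JSON."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


# Hypothetical sample data; the real Card shape is not given in the question.
cards = [{"cardId": f"card-{i}", "title": "Example card", "price": i} for i in range(5000)]
raw_size = len(json.dumps(cards, separators=(",", ":")).encode("utf-8"))
blob = compress_cards(cards)

# The blob would go into a Binary attribute, e.g. {"UserId": "42", "Listings": blob}.
# The limit applies to the whole item, so attribute names and other attributes count too.
fits_in_one_item = len(blob) < MAX_ITEM_BYTES
```

Compression only postpones the problem: a sufficiently long array will still exceed the limit, and the compressed attribute can no longer be filtered or updated element by element.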

A third option would be to split the array and store it across multiple items in DynamoDB. You could also try creating separate tables for separate attributes, but this obviously won't solve the issue if, for example, the Listings array alone is larger than the item size limit.
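One way to picture the splitting idea is to cut the array into fixed-size chunks and write each chunk as its own item. This is only a sketch: the ChunkKey and Cards attribute names and the zero-padded numbering are assumptions for illustration, not something the answer prescribes.

```python
def chunk_cards(cards, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [cards[i:i + chunk_size] for i in range(0, len(cards), chunk_size)]


def to_chunk_items(user_id, attribute, cards, chunk_size):
    """Build one item per chunk; the key names here are illustrative assumptions."""
    return [
        {
            "UserId": user_id,
            # Zero-padded index so the chunks sort in order as strings.
            "ChunkKey": f"{attribute}#{index:05d}",
            "Cards": chunk,
        }
        for index, chunk in enumerate(chunk_cards(cards, chunk_size))
    ]


# 25 placeholder cards split into chunks of 10 gives three items.
items = to_chunk_items("42", "Listings", list(range(25)), 10)
```

The chunk size has to be chosen so that each resulting item stays under the 400 KB limit, which depends on how large individual cards are.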

One option is to use the Single Table Design approach, where all users, watchings, and listings are in the same table.

User items would have a partition key (pk) of user#uid and a sort key (sk) of user#uid (the same as pk). Each watching item would have a pk of user#uid and an sk of watching#wid, and each listing would have a pk of user#uid and an sk of listing#lid.

For example:

pk        sk             other attributes
user#1    user#1         yes
user#2    user#2         yes
user#42   user#42        yes
user#42   watching#12    optional
user#42   watching#29    optional
user#42   listing#901    optional
user#42   listing#472    optional
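To make the key scheme concrete, here is a minimal sketch that turns one user and its watching and listing IDs into the items shown in the table. The function name and the plain-dict item shape are assumptions for illustration; real items would also carry whatever other attributes each entity needs.

```python
def user_to_items(user_id, watching_ids, listing_ids):
    """Build single-table items for a user plus its watching and listing relations."""
    pk = f"user#{user_id}"
    # The user item uses the same value for pk and sk.
    items = [{"pk": pk, "sk": pk}]
    items += [{"pk": pk, "sk": f"watching#{wid}"} for wid in watching_ids]
    items += [{"pk": pk, "sk": f"listing#{lid}"} for lid in listing_ids]
    return items


# Reproduces the user#42 rows from the example table.
items = user_to_items(42, [12, 29], [901, 472])
```

Because every relation is its own small item, no single item grows with the number of cards a user watches or lists.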

This approach has no real limit on the number of watching or listing relationships.

You can then query all items for a given user simply by issuing a query for pk=user#42, which will yield the user and all associated watching and listing items (pagination notwithstanding). You can query all listings for a given user with pk=user#42 and sk begins_with("listing#").
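The two queries described above could be expressed as request parameters for the low-level DynamoDB Query operation, roughly as follows. The helper only builds the parameter dictionary and does not call AWS; the table name "app-table" and the function name are assumptions for illustration.

```python
def build_user_query(table_name, user_id, sk_prefix=None):
    """Build Query parameters for one user, optionally restricted by sort-key prefix."""
    params = {
        "TableName": table_name,
        "KeyConditionExpression": "pk = :pk",
        "ExpressionAttributeValues": {":pk": {"S": f"user#{user_id}"}},
    }
    if sk_prefix is not None:
        # begins_with is only allowed on the sort key in a key condition.
        params["KeyConditionExpression"] += " AND begins_with(sk, :prefix)"
        params["ExpressionAttributeValues"][":prefix"] = {"S": sk_prefix}
    return params


# Everything stored under user#42 (user item, watchings, listings).
all_items_query = build_user_query("app-table", 42)

# Only the listings of user#42.
listings_query = build_user_query("app-table", 42, sk_prefix="listing#")
```

With boto3, a dictionary like this could be passed as `client.query(**listings_query)`. Since a single Query response is capped in size, the caller would repeat the request with `ExclusiveStartKey` set to the previous response's `LastEvaluatedKey` until no such key is returned.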

Note that this will increase the table size because of the additional "user", "watching", and "listing" prefixes on attribute values, so you may want to consider abbreviating them.

To quote Alex DeBrie:

The main reason for using a single table in DynamoDB is to retrieve multiple, heterogenous item types using a single request.

In DynamoDB, every row (item) in a table is a single object, and an item can be viewed in JSON object format.

An item cannot itself be a top-level JSON array. An attribute can hold a list, but the whole item is still subject to the item size limit.

Split the data into two tables: store the listings in table A and the other array in table B. Design the partition key and sort key so that the records belonging to a particular user can be identified.
