DynamoDb batchGetItem and Partition Key and Sort Key

I tried to use batchGetItem to return attributes of more than one item from a table, but it seems to work only with the combination of the partition key and range key. What if I want to identify the requested items only by primary key? Is the only way to create the table without the range key?

    // Adding items
    $client->putItem(array(
        'TableName' => $table,
        'Item' => array(
            'id'     => array('S' => '2a49ab04b1534574e578a08b8f9d7441'),
            'name'   => array('S' => 'test1'),
            'user_name'   => array('S' => 'aaa.bbb')
        )
    ));

    // Adding items
    $client->putItem(array(
        'TableName' => $table,
        'Item' => array(
            'id'     => array('S' => '4fd70b72cc21fab4f745a6073326234d'),
            'name'   => array('S' => 'test2'),
            'user_name'   => array('S' => 'aaaa.bbbb'),
            'user_name1'   => array('S' => 'aaaaa.bbbbb')
        )
    ));

    // Reading both items back in one batch
    $client->batchGetItem(array(
        "RequestItems" => array(
            $table => array(
                "Keys" => array(
                    array(
                        // hash key
                        "id"   => array('S' => "2a49ab04b1534574e578a08b8f9d7441"),
                        // range key
                        "name" => array('S' => "test1"),
                    ),
                    array(
                        // hash key
                        "id"   => array('S' => "4fd70b72cc21fab4f745a6073326234d"),
                        // range key
                        "name" => array('S' => "test2"),
                    ),
                )
            )
        )
    ));

As per the official documentation:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html

If the table has a composite primary key (partition key and sort key), DynamoDB calculates the hash value of the partition key in the same way as described in Data Distribution: Partition Key, but it stores all of the items with the same partition key value physically close together, ordered by sort key value.

What are the advantages of using a partition key and sort key, besides storing all of the items with the same partition key value physically close together?

As per the official documentation:

A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items. BatchGetItem will return a partial result if the response size limit is exceeded, the table's provisioned throughput is exceeded, or an internal processing failure occurs.

How do I handle the request if I need more than 100 items? Do I just loop through all the items in code and request 100 at a time, or is there another way to achieve it via the AWS SDK for DynamoDB?

Example of table creation:

    $client->createTable(array(
        'TableName' => $table,
        'AttributeDefinitions' => array(
            array(
                'AttributeName' => 'id',
                // 'S' (string) so the type matches the hex-string ids
                // written by putItem above; 'N' would not match them
                'AttributeType' => 'S'
            ),
            array(
                'AttributeName' => 'name',
                'AttributeType' => 'S'
            )
        ),
        'KeySchema' => array(
            array(
                'AttributeName' => 'id',
                'KeyType'       => 'HASH'   // partition key
            ),
            array(
                'AttributeName' => 'name',
                'KeyType'       => 'RANGE'  // sort key
            )
        ),
        'ProvisionedThroughput' => array(
            'ReadCapacityUnits'  => 5,
            'WriteCapacityUnits' => 5
        )
    ));

Thanks

UPDATE - Question about Mark B's answer:

Yes you can create an index without a range key. The range key is entirely optional. However, even if you have a range key defined it is optional to include it in your query. You can simply specify the hash key in your query to get all items with the hash key, which will be returned in an order based on the range key.

If I specify only the hash key in my request on a table with both a hash key and a range key, I get the error below. If I specify only the hash key on a table without a range key, it works. Please note that the table has no index.

An uncaught Exception was encountered

Type:        Aws\DynamoDb\Exception\DynamoDbException
Message:     Error executing "BatchGetItem" on "https://dynamodb.eu-central-1.amazonaws.com"; AWS HTTP error: Client error: `POST https://dynamodb.eu-central-1.amazonaws.com` resulted in a `400 Bad Request` response:
{"__type":"com.amazon.coral.validate#ValidationException","message":"The provided key element does not match the schema" (truncated...)
 ValidationException (client): The provided key element does not match the schema - {"__type":"com.amazon.coral.validate#ValidationException","message":"The provided key element does not match the schema"}
Filename:    /var/app/vendor/aws/aws-sdk-php/src/WrappedHttpHandler.php

You've asked a lot of questions, so I'll try to break them down. (Sorry, I can't answer the question with PHP code snippets.)

I tried to use batchGetItem to return attributes of more than one item from a table, but it seems to work only with the combination of the partition key and range key. What if I want to identify the requested items only by primary key? Is the only way to create the table without the range key?

BatchGetItem is the same as multiple GetItem calls. Essentially, each GetItem call retrieves zero or one item. You give it the unique key (primary key) of the item you wish to retrieve. If your table has only a partition key, then that's all you specify; otherwise you specify both the partition key and the range key. BatchGetItem batches GetItem calls into one request to DynamoDB.

If you wish to query for multiple items for a given partition key, you want to look at the Query API.
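As a rough illustration of that suggestion, here is a minimal PHP sketch that builds the parameters for a Query on the partition key alone. The helper name `buildPartitionQuery` and the table name `my-table` are made up for this example; the attribute name `id` comes from the question's table. The actual call is left as a comment because it needs a configured client.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): builds the parameter
// array for a Query that matches every item sharing one partition key.
function buildPartitionQuery(string $table, string $id): array
{
    return [
        'TableName'                 => $table,
        'KeyConditionExpression'    => '#id = :id',
        // Name placeholders avoid clashes with DynamoDB reserved words.
        'ExpressionAttributeNames'  => ['#id' => 'id'],
        'ExpressionAttributeValues' => [':id' => ['S' => $id]],
    ];
}

$params = buildPartitionQuery('my-table', '2a49ab04b1534574e578a08b8f9d7441');

// With a configured Aws\DynamoDb\DynamoDbClient this would be sent as:
// $result = $client->query($params);
// foreach ($result['Items'] as $item) { /* ... */ }
```

Unlike BatchGetItem, a Query like this targets a single partition key per call, so fetching items under several partition keys means one Query per key.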

What are the advantages of using a partition key and sort key, besides storing all of the items with the same partition key value physically close together?

This is a difficult question to answer, as it heavily depends on the unique key of your data model.

Some advantages that come to mind are:

1. Sort keys enable you to sort the data on that attribute (in ascending or descending order).
2. Sort keys support more comparison operations (e.g. greater than, less than, between, begins with).

See the docs.
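Both advantages can be sketched in one Query: a `begins_with` condition on the sort key plus `ScanIndexForward` to choose the sort direction. The helper name `buildPrefixQuery` and the table name are assumptions for illustration; `id` and `name` are the key attributes from the question. `name` is, as far as I know, a DynamoDB reserved word, so it is referenced through a placeholder.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): Query parameters for all
// items under one partition key whose sort key starts with a given prefix.
function buildPrefixQuery(string $table, string $id, string $prefix, bool $ascending = true): array
{
    return [
        'TableName'                 => $table,
        'KeyConditionExpression'    => '#id = :id AND begins_with(#name, :prefix)',
        'ExpressionAttributeNames'  => ['#id' => 'id', '#name' => 'name'],
        'ExpressionAttributeValues' => [
            ':id'     => ['S' => $id],
            ':prefix' => ['S' => $prefix],
        ],
        // true = ascending sort key order, false = descending.
        'ScanIndexForward'          => $ascending,
    ];
}

$params = buildPrefixQuery('my-table', '2a49ab04b1534574e578a08b8f9d7441', 'test', false);

// With a configured Aws\DynamoDb\DynamoDbClient:
// $result = $client->query($params);
```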

How do I handle the request if I need more than 100 items? Do I just loop through all the items in code and request 100 at a time, or is there another way to achieve it via the AWS SDK for DynamoDB?

If you request more than 100 items, BatchGetItem will return a ValidationException with the message "Too many items requested for the BatchGetItem call". You will need to loop through the keys, 100 at a time, to get all the items you need. Keep in mind that there is also a response size limit of 16 MB; if any keys are left unprocessed because of it, they are returned in the response under "UnprocessedKeys".

If DynamoDB returns any unprocessed items, you should retry the batch operation on those items. However, we strongly recommend that you use an exponential backoff algorithm. If you retry the batch operation immediately, the underlying read or write requests can still fail due to throttling on the individual tables. If you delay the batch operation using exponential backoff, the individual requests in the batch are much more likely to succeed.

This documentation explains how to use it.
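The "100 at a time" loop described above can be sketched in PHP by splitting the key list with `array_chunk`. The helper name `buildBatchGetRequests`, the table name, and the generated keys are made up for illustration; each resulting array has the shape that `batchGetItem` expects.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): split a list of primary
// keys into BatchGetItem request arrays of at most $limit keys each.
function buildBatchGetRequests(string $table, array $keys, int $limit = 100): array
{
    $requests = [];
    foreach (array_chunk($keys, $limit) as $chunk) {
        $requests[] = ['RequestItems' => [$table => ['Keys' => $chunk]]];
    }
    return $requests;
}

// 250 made-up keys; every key carries both the hash and the range key.
$keys = [];
for ($i = 0; $i < 250; $i++) {
    $keys[] = ['id' => ['S' => "id-$i"], 'name' => ['S' => "name-$i"]];
}

$requests = buildBatchGetRequests('my-table', $keys);
// count($requests) is 3: batches of 100, 100 and 50 keys.

// With a configured Aws\DynamoDb\DynamoDbClient, each batch would be sent as:
// foreach ($requests as $request) { $result = $client->batchGetItem($request); }
```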

But what if I want to identify the requested items only by primary key? Is the only way to create the table without the range key?

Yes, you can create an index without a range key. The range key is entirely optional. However, even if you have a range key defined, it is optional to include it in your query. You can simply specify the hash key in your query to get all items with that hash key, which will be returned in an order based on the range key. Note that this applies to the Query operation; GetItem and BatchGetItem require the complete primary key.

What are the advantages of using a partition key and sort key, besides storing all of the items with the same partition key value physically close together?

The two fields combined are your primary key, which guarantees uniqueness. The range/sort key also determines the order in which results are returned.

How do I handle the request if I need more than 100 items?

From the documentation (emphasis mine):

The maximum number of item attributes that can be retrieved for a single operation is 100. Also, the number of items retrieved is constrained by a 1 MB size limit. If the response size limit is exceeded or a partial result is returned due to an internal processing failure, Amazon DynamoDB returns an UnprocessedKeys value so you can retry the operation starting with the next item to get.

For example, even if you ask to retrieve 100 items, but each individual item is 50k in size, the system returns 20 items and an appropriate UnprocessedKeys value so you can get the next page of results. If necessary, your application needs its own logic to assemble the pages of results into one set.

So you would need to check the UnprocessedKeys value of the result and continue making requests in your application until there are no more UnprocessedKeys.
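A minimal PHP sketch of that loop follows, with exponential backoff between retries. The helper name `batchGetWithRetry`, the attempt limit, and the delay values are assumptions for illustration. The `$batchGet` callable is expected to send one request and return the response as a plain array; with the AWS SDK for PHP it could be `fn(array $req) => $client->batchGetItem($req)->toArray()`.

```php
<?php
// Hypothetical helper (not part of the AWS SDK): send one BatchGetItem
// request and keep re-requesting whatever comes back under
// UnprocessedKeys, sleeping longer before each retry.
function batchGetWithRetry(callable $batchGet, array $request, int $maxAttempts = 5, int $baseDelayMs = 50): array
{
    $items   = [];
    $pending = $request['RequestItems'];

    for ($attempt = 0; $attempt < $maxAttempts && !empty($pending); $attempt++) {
        if ($attempt > 0) {
            // Exponential backoff: base, 2x base, 4x base, ...
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }

        $response = $batchGet(['RequestItems' => $pending]);

        // Collect the items returned so far, grouped by table name.
        foreach ($response['Responses'] ?? [] as $table => $tableItems) {
            foreach ($tableItems as $item) {
                $items[$table][] = $item;
            }
        }

        // UnprocessedKeys has the same shape as RequestItems, so it can
        // be sent back as the next request unchanged.
        $pending = $response['UnprocessedKeys'] ?? [];
    }

    if (!empty($pending)) {
        throw new RuntimeException("Keys still unprocessed after $maxAttempts attempts");
    }

    return $items;
}
```

Taking the sender as a callable keeps the loop independent of the SDK, which also makes it easy to exercise with a stub that returns canned responses.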
