How to use StartingToken with DynamoDB pagination scan

I have a DynamoDB table and I want to output items from it to a client using pagination. I thought I'd use DynamoDB.Paginator.Scan and supply StartingToken, but I don't see NextToken in the output of either the page or the iterator itself. So how do I get it?

My goal is a REST API where the client requests the next X items from a table, supplying StartingToken to iterate. Initially there is no token, but with each response the server returns NextToken, which the client supplies as StartingToken to get the next X items.

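import boto3
import json

table = "TableName"
client = boto3.client("dynamodb")
# The question is about Scan, so request the "scan" paginator
# (a "query" paginator would also need a KeyConditionExpression).
paginator = client.get_paginator("scan")
token = None
size = 1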

for i in range(1,10):
    client.put_item(TableName=table, Item={"PK":{"S":str(i)},"SK":{"S":str(i)}})

it = paginator.paginate(
    TableName=table,
    ProjectionExpression="PK,SK",
    PaginationConfig={"MaxItems": 100, "PageSize": size, "StartingToken": token}
)

for page in it:
    print(json.dumps(page, indent=2))
    break

As a side note: how do I get one page from the paginator without using break/for? I tried next(it), but it does not work.

Here's the it object:

{
'_input_token': ['ExclusiveStartKey'],
 '_limit_key': 'Limit',
 '_max_items': 100,
 '_method': <bound method ClientCreator._create_api_method.<locals>._api_call of <botocore.client.DynamoDB object at 0x000001CBA5806AA0>>,
 '_more_results': None,
 '_non_aggregate_key_exprs': [{'type': 'field', 'children': [], 'value': 'ConsumedCapacity'}],
 '_non_aggregate_part': {'ConsumedCapacity': None},
 '_op_kwargs': {'Limit': 1,
                'ProjectionExpression': 'PK,SK',
                'TableName': 'TableName'},
 '_output_token': [{'type': 'field', 'children': [], 'value': 'LastEvaluatedKey'}],
 '_page_size': 1,
 '_result_keys': [{'type': 'field', 'children': [], 'value': 'Items'},
                  {'type': 'field', 'children': [], 'value': 'Count'},
                  {'type': 'field', 'children': [], 'value': 'ScannedCount'}],
 '_resume_token': None,
 '_starting_token': None,
 '_token_decoder': <botocore.paginate.TokenDecoder object at 0x000001CBA5D81960>,
 '_token_encoder': <botocore.paginate.TokenEncoder object at 0x000001CBA5D82290>
}

And the page:

{
  "Items": [
    {
      "PK": {
        "S": "2"
      },
      "SK": {
        "S": "2"
      }
    }
  ],
  "Count": 1,
  "ScannedCount": 1,
  "LastEvaluatedKey": {
    "PK": {
      "S": "2"
    },
    "SK": {
      "S": "2"
    }
  },
  "ResponseMetadata": {
    "RequestId": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "server": "Server",
      "date": "Fri, 30 Dec 2022 11:37:52 GMT",
      "content-type": "application/x-amz-json-1.0",
      "content-length": "121",
      "connection": "keep-alive",
      "x-amzn-requestid": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
      "x-amz-crc32": "973385738"
    },
    "RetryAttempts": 0
  }
}

I thought I could use LastEvaluatedKey, but that throws an error. I also tried to get the token like this, but it did not work:

it._token_encoder.encode(page["LastEvaluatedKey"])

I also thought about using plain scan without the paginator, but I'm actually outputting a heavily filtered result set. I need to set Limit to a very large value to get any results, and I don't want too many results at once. Is there a way to scan up to 1000 items but stop as soon as 10 items are found?

I would suggest not using the paginator and instead calling the lower-level Scan or Query operation directly. The reason is the confusion between NextToken and LastEvaluatedKey: they are not interchangeable.

  • LastEvaluatedKey (from a raw Scan/Query response) is passed to ExclusiveStartKey
  • NextToken (from the paginator) is passed to StartingToken in PaginationConfig
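For the paginator pairing in the list above, here is a minimal sketch of a token round trip. It relies on botocore's build_full_result(), which aggregates the pages and adds a NextToken entry when the results were cut off by MaxItems; the individual pages yielded by the iterator are raw responses and only carry LastEvaluatedKey. The function name fetch_page and its parameters are hypothetical, and client is assumed to be a boto3.client("dynamodb") instance.

```python
from typing import Any, List, Optional, Tuple


def fetch_page(client: Any, table_name: str, page_size: int,
               starting_token: Optional[str] = None) -> Tuple[List[dict], Optional[str]]:
    """Return up to page_size items plus an opaque token for the next call.

    `client` is assumed to be a boto3 DynamoDB client; fetch_page itself is
    a hypothetical helper, not part of boto3.
    """
    paginator = client.get_paginator("scan")
    pages = paginator.paginate(
        TableName=table_name,
        ProjectionExpression="PK,SK",
        PaginationConfig={
            "MaxItems": page_size,   # stop once this many items were returned
            "PageSize": page_size,   # sent to DynamoDB as Limit
            "StartingToken": starting_token,
        },
    )
    result = pages.build_full_result()
    # NextToken is absent once the scan has reached the end of the table.
    return result.get("Items", []), result.get("NextToken")
```

A REST handler could return the second value to the caller and pass it back in as starting_token on the next request. With a filter, PageSize can be set much higher than MaxItems so that each request evaluates many items while the result is still cut off at MaxItems. On the side note in the question: the object returned by paginate() is an iterable rather than an iterator, so next(iter(it)) should yield the first page where next(it) fails.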

It's preferable to use the resource API (boto3.resource), which I believe makes pagination less confusing:

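import boto3

region = 'us-east-1'  # example value; replace with your table's region
dynamodb = boto3.resource('dynamodb', region_name=region)

table = dynamodb.Table('my-table')

# Scan needs no key condition; a Query would also require KeyConditionExpression
response = table.scan()
data = response['Items']

# LastEvaluatedKey indicates that there are more results
while 'LastEvaluatedKey' in response:
    response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    # Items is a list, so append the new page with extend()
    data.extend(response['Items'])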

The LastEvaluatedKey is in the response object and can be set as the ExclusiveStartKey in the scan.
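The same loop can be bounded to address the last part of the question (evaluate at most N items, but stop early once enough matches are found). The sketch below is one possible approach, not a built-in boto3 feature: scan_until and its parameters are hypothetical names, table is assumed to be a boto3 Table resource, and max_scanned is assumed to be positive. It always returns whole pages, so it can return a few more than wanted items (fewer than page_limit extra), and the returned key stays a valid resume point.

```python
from typing import Any, Dict, List, Optional, Tuple


def scan_until(table: Any, wanted: int, max_scanned: int, page_limit: int,
               start_key: Optional[Dict[str, Any]] = None,
               **scan_kwargs: Any) -> Tuple[List[dict], Optional[Dict[str, Any]]]:
    """Scan in small pages until `wanted` items matched or `max_scanned` were evaluated.

    Hypothetical helper. Returns (items, last_evaluated_key); the key is None
    once the end of the table has been reached.
    """
    items: List[dict] = []
    scanned = 0
    key = start_key
    while True:
        # Never ask DynamoDB to evaluate more than the remaining budget.
        kwargs = dict(scan_kwargs, Limit=min(page_limit, max_scanned - scanned))
        if key is not None:
            kwargs["ExclusiveStartKey"] = key
        response = table.scan(**kwargs)
        items.extend(response["Items"])
        scanned += response["ScannedCount"]   # items evaluated before filtering
        key = response.get("LastEvaluatedKey")
        if key is None or len(items) >= wanted or scanned >= max_scanned:
            return items, key
```

For the case in the question this might be called as scan_until(table, wanted=10, max_scanned=1000, page_limit=50, FilterExpression=...), with the returned key passed back as start_key for the next batch. A smaller page_limit reduces the overshoot at the cost of more round trips.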

Sample code showing this can be found in the AWS DynamoDB sample code (here, for example).
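For the REST API described in the question, the LastEvaluatedKey has to travel to the client and back as a single string. One way to do that, sketched here under the assumption that the key only contains JSON-serializable values (for example the string attributes PK and SK shown in the question), is to base64-encode its JSON form. The names encode_key and decode_key are hypothetical. Numeric keys returned by the resource API arrive as Decimal and would need extra handling that this sketch does not cover.

```python
import base64
import json
from typing import Any, Dict, Optional


def encode_key(last_evaluated_key: Optional[Dict[str, Any]]) -> Optional[str]:
    """Turn a LastEvaluatedKey into a URL-safe token, or None on the last page."""
    if last_evaluated_key is None:
        return None
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_key(token: Optional[str]) -> Optional[Dict[str, Any]]:
    """Turn a client-supplied token back into an ExclusiveStartKey."""
    if not token:
        return None
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))
```

The server would send encode_key(response.get('LastEvaluatedKey')) with each response and pass decode_key(token) as ExclusiveStartKey on the next request, omitting the parameter when it is None. The token is only encoded, not signed, so the decoded key should be treated as untrusted input.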
