AWS DynamoDB BOTO3 Confusing Scan

Basically, if I loop over a range of dates and run a scan with a one-day date range for each day, like this:

table_hook = dynamodb_resource.Table('table1')

date_filter = Key('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

if (response['Count']) == 0:
    return

_counter = 1

while 'LastEvaluatedKey' in response:
    response = table_hook.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    first_date = parser.parse(response['Items'][0]['date_column']).replace(tzinfo=None)
    if (
        first_date < parser.parse('2021-01-01T00:00:00+00:00').replace(tzinfo=None)
        or first_date > parser.parse('2021-06-07T23:59:59+00:00').replace(tzinfo=None)
    ):
        break
        
    incoming_data.extend(response['Items'])
    _counter+=1
    print("|->   Getting page %s" % _counter)

Looping day by day from Day1 to Day2, it retrieves X rows in total.

But if I run the same paginated scan once over the whole range (Day1 to Day2), without the loop, it retrieves Y rows.

To make things even more confusing, when I call table.describe_table(TableName='table1'), the row_count field reports Z rows. I really don't understand what is going on!

Thanks to the help from the people above, I found my error: I was not passing the filter again when paginating. The fixed code is:

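import boto3
from boto3.dynamodb.conditions import Attr

dynamodb_resource = boto3.resource('dynamodb')
table_hook = dynamodb_resource.Table('table1')

# Attr is the documented builder for scan filters; Key is meant for query key conditions.
date_filter = Attr('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

_counter = 1

# LastEvaluatedKey is present whenever more pages remain, even if this page matched nothing.
while 'LastEvaluatedKey' in response:
    # Pass the filter on every page; without it the scan returns unfiltered items.
    response = table_hook.scan(FilterExpression=date_filter,
                               ExclusiveStartKey=response['LastEvaluatedKey'])

    incoming_data.extend(response['Items'])
    _counter += 1
    print("|->   Getting page %s" % _counter)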
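To avoid forgetting an argument on later pages, the pagination can be wrapped in a small helper that reuses the same keyword arguments for every call. The following is only a sketch: scan_all and FakeTable are hypothetical names, not part of boto3. FakeTable is an in-memory stand-in so the example runs without AWS, and its FilterExpression is a plain Python callable rather than a boto3 condition object. Its second page matches nothing, which shows that an empty page does not mean the scan is finished.

```python
def scan_all(table, **scan_kwargs):
    """Collect every item from a paginated scan, reusing the same arguments on each page."""
    items = []
    while True:
        response = table.scan(**scan_kwargs)
        items.extend(response.get('Items', []))
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            return items
        # Only the start key changes between pages; FilterExpression and the rest are kept.
        scan_kwargs['ExclusiveStartKey'] = last_key


class FakeTable:
    """Hypothetical in-memory stand-in for a boto3 Table, for local experiments only."""

    def __init__(self, pages):
        self.pages = pages  # list of pages, each a list of item dicts

    def scan(self, FilterExpression=None, ExclusiveStartKey=None):
        index = 0 if ExclusiveStartKey is None else ExclusiveStartKey['page']
        items = self.pages[index]
        if FilterExpression is not None:
            # Stand-in only: here the filter is a plain callable, not a boto3 condition.
            items = [item for item in items if FilterExpression(item)]
        response = {'Items': items, 'Count': len(items)}
        if index + 1 < len(self.pages):
            response['LastEvaluatedKey'] = {'page': index + 1}
        return response


pages = [
    [{'date_column': '2021-01-01T08:00:00+00:00'},
     {'date_column': '2021-01-02T08:00:00+00:00'}],
    [{'date_column': '2021-01-03T08:00:00+00:00'}],  # nothing matches on this page
    [{'date_column': '2021-01-01T20:00:00+00:00'}],
]


def in_day(item):
    return '2021-01-01T00:00:00+00:00' <= item['date_column'] <= '2021-01-01T23:59:59+00:00'


matching = scan_all(FakeTable(pages), FilterExpression=in_day)
print(len(matching))  # 2
```

With a real table the call would presumably look like scan_all(table_hook, FilterExpression=date_filter). As for the Z figure: the item count that describe_table reports (ItemCount in the response) covers the whole table rather than a date range, and DynamoDB refreshes it only about every six hours, so it is not expected to match the result of a filtered scan.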
