
AWS DynamoDB BOTO3 Confusing Scan

Basically, if I loop over a range of dates and perform a scan with a per-day date range, like:

table_hook = dynamodb_resource.Table('table1')

date_filter = Key('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

if (response['Count']) == 0:
    return

_counter = 1

while 'LastEvaluatedKey' in response:
    response = table_hook.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
    if ( 
        parser.parse(response['Items'][0]['date_column']).replace(tzinfo=None) < parser.parse('2021-01-01T00:00:00+00:00').replace(tzinfo=None) 
            or 
        parser.parse(response['Items'][0]['date_column']).replace(tzinfo=None).replace(tzinfo=None) > parser.parse('2021-06-07T23:59:59+00:00').replace(tzinfo=None) 
    ):
        break
        
    incoming_data.extend(response['Items'])
    _counter+=1
    print("|->   Getting page %s" % _counter)

At the end of the Day1-to-Day2 loop, it retrieves X rows.

But if I perform the same scan the same way (paginating), over the same range (Day1 to Day2), without looping, it retrieves Y rows.

And to top it off, when I call table.describe_table(TableName='table1'), the row_count field reports Z rows. I really don't understand what is going on!
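One way to see where numbers like X and Y come from is to tally the `Count` and `ScannedCount` fields that every scan response carries: `Count` is the number of items left after the filter, and `ScannedCount` is the number of items read before it. The sketch below is only an illustration; `tally_scan_pages` and the sample responses are made up for this example and are not part of boto3. As far as I know, the item count reported by `describe_table` is an approximate figure that DynamoDB refreshes only periodically, so it would not be expected to match either total exactly.

```python
def tally_scan_pages(pages):
    """Sum Count and ScannedCount over a list of scan response dicts."""
    matched = sum(page.get("Count", 0) for page in pages)
    scanned = sum(page.get("ScannedCount", 0) for page in pages)
    return {"matched": matched, "scanned": scanned}


# Hypothetical responses: the first page was requested with a filter,
# the second page without one, so every scanned item was returned.
pages = [
    {"Count": 2, "ScannedCount": 50},
    {"Count": 50, "ScannedCount": 50},
]

totals = tally_scan_pages(pages)
print(totals)
```

If a page shows `Count` equal to `ScannedCount`, that is a hint that no filter was applied to that request.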

Thanks to the help from the people above, I found my error: I was not passing the filter again when paginating. The fixed code is:

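from boto3.dynamodb.conditions import Key

# dynamodb_resource is assumed to be created elsewhere, e.g. boto3.resource('dynamodb')
table_hook = dynamodb_resource.Table('table1')

date_filter = Key('date_column').between('2021-01-01T00:00:00+00:00', '2021-01-01T23:59:59+00:00')

response = table_hook.scan(FilterExpression=date_filter)
incoming_data = response['Items']

_counter = 1

# Keep requesting pages until DynamoDB stops returning LastEvaluatedKey
while 'LastEvaluatedKey' in response:
    # Pass the filter on every page, not only on the first call
    response = table_hook.scan(FilterExpression=date_filter,
                               ExclusiveStartKey=response['LastEvaluatedKey'])

    incoming_data.extend(response['Items'])
    _counter += 1
    print("|->   Getting page %s" % _counter)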
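To make it harder to forget the filter on later pages, the loop can be wrapped in a small helper that reuses the same keyword arguments for every request. This is a minimal sketch, not an official boto3 utility: `scan_all` is a hypothetical name, and `FakeTable` is a stand-in written only so the helper can be tried offline. The stand-in takes a plain Python callable as `FilterExpression`, whereas a real table would take a condition object such as `date_filter` above.

```python
def scan_all(table, **scan_kwargs):
    """Collect items from every page, reusing the same scan arguments."""
    items = []
    kwargs = dict(scan_kwargs)
    while True:
        response = table.scan(**kwargs)
        items.extend(response.get("Items", []))
        last_key = response.get("LastEvaluatedKey")
        if last_key is None:
            break
        # Only the start key changes; the filter stays in kwargs.
        kwargs["ExclusiveStartKey"] = last_key
    return items


class FakeTable:
    """Offline stand-in for a boto3 Table (hypothetical, for illustration)."""

    def __init__(self, rows, page_size):
        self.rows = rows
        self.page_size = page_size

    def scan(self, FilterExpression=None, ExclusiveStartKey=None):
        start = ExclusiveStartKey or 0
        page = self.rows[start:start + self.page_size]
        # Like DynamoDB, read the page first and filter afterwards.
        if FilterExpression is not None:
            matched = [row for row in page if FilterExpression(row)]
        else:
            matched = page
        response = {"Items": matched, "Count": len(matched),
                    "ScannedCount": len(page)}
        end = start + self.page_size
        if end < len(self.rows):
            response["LastEvaluatedKey"] = end
        return response


rows = [
    {"date_column": "2021-01-01T10:00:00+00:00"},
    {"date_column": "2021-01-02T10:00:00+00:00"},
    {"date_column": "2021-01-01T12:00:00+00:00"},
    {"date_column": "2021-01-03T09:00:00+00:00"},
]


def in_day(row):
    """Callable filter used only with FakeTable."""
    return ("2021-01-01T00:00:00+00:00" <= row["date_column"]
            <= "2021-01-01T23:59:59+00:00")


items = scan_all(FakeTable(rows, page_size=2), FilterExpression=in_day)
print(len(items))
```

With a real table the call would presumably look like `scan_all(table_hook, FilterExpression=date_filter)`.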

